RELATED APPLICATIONS

This application claims the benefit of U.S. Provisional Application Serial No.
60/153,357, filed September 10, 2000, U.S. Provisional Application Serial No.
60/220,947, filed July 26, 2000, and U.S. Provisional Application Serial No.
60/225,724, filed August 16, 2000, the entire teachings of all of which are incorporated
herein by reference.



BACKGROUND OF THE INVENTION 

The genomes of all organisms undergo spontaneous mutation in the course of
their continuing evolution, generating variant forms of progenitor nucleic acid
sequences (Gusella, Ann. Rev. Biochem. 55, 831-854 (1986)). The variant form may
confer an evolutionary advantage or disadvantage relative to a progenitor form, or may
be neutral. In some instances, a variant form confers a lethal disadvantage and is not
transmitted to subsequent generations of the organism. In other instances, a variant
form confers an evolutionary advantage to the species and is eventually incorporated
into the DNA of many or most members of the species and effectively becomes the
progenitor form. In many instances, both progenitor and variant form(s) survive and
coexist in a species population. The coexistence of multiple forms of a sequence gives
rise to polymorphisms.

Several different types of polymorphism have been reported. A restriction
fragment length polymorphism (RFLP) is a variation in DNA sequence that alters the
length of a restriction fragment (Botstein et al., Am. J. Hum. Genet. 32, 314-331
(1980)). The restriction fragment length polymorphism may create or delete a
restriction site, thus changing the length of the restriction fragment. RFLPs have been
widely used in human and animal genetic analyses (see WO 90/13668; WO 90/11369;
Donis-Keller, Cell 51, 319-337 (1987); Lander et al., Genetics 121, 85-99 (1989)).
When a heritable trait can be linked to a particular RFLP, the presence of the RFLP in
an individual can be used to predict the likelihood that the individual will also exhibit
the trait.

Other polymorphisms take the form of short tandem repeats (STRs) that include
tandem di-, tri- and tetra-nucleotide repeated motifs. These tandem repeats are also
referred to as variable number tandem repeat (VNTR) polymorphisms. VNTRs have
been used in identity and paternity analysis (US 5,075,217; Armour et al., FEBS Lett.
307, 113-115 (1992); Horn et al., WO 91/14003; Jeffreys, EP 370,719), and in a large
number of genetic mapping studies.

Other polymorphisms take the form of single nucleotide variations between
individuals of the same species. Such polymorphisms are far more frequent than
RFLPs, STRs and VNTRs. Some single nucleotide polymorphisms (SNPs) occur in
protein-coding nucleic acid sequences (coding sequence SNPs (cSNPs)), in which case
one of the polymorphic forms may give rise to the expression of a defective or
otherwise variant protein and, potentially, a genetic disease. Examples of genes in
which polymorphisms within coding sequences give rise to genetic disease include
β-globin (sickle cell anemia), apoE4 (Alzheimer's Disease), Factor V Leiden (thrombosis),
and CFTR (cystic fibrosis). cSNPs can alter the codon sequence of the gene and
therefore specify an alternative amino acid. Such changes are called "missense" when
another amino acid is substituted, and "nonsense" when the alternative codon specifies a
stop signal in protein translation. When the cSNP does not alter the amino acid
specified, the cSNP is called "silent".
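
The missense, nonsense and silent categories described above can be illustrated with a
short sketch. It assumes the standard genetic code and a codon supplied by the caller;
the function name classify_csnp and the example codons are hypothetical and are not
taken from the sequences of this application.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def classify_csnp(ref_codon, pos, alt_base):
    """Classify a single-base change within a codon.

    ref_codon: three-letter DNA codon of the reference allele (hypothetical input).
    pos: 0-based position of the changed base within the codon.
    alt_base: base found at that position in the variant allele.
    """
    alt_codon = ref_codon[:pos] + alt_base + ref_codon[pos + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        return "silent"      # same amino acid (or same stop signal) is specified
    if alt_aa == "*":
        return "nonsense"    # the variant codon specifies a stop signal
    return "missense"        # a different amino acid is substituted


if __name__ == "__main__":
    print(classify_csnp("GAG", 1, "T"))  # Glu codon changed to a Val codon
    print(classify_csnp("TAT", 2, "A"))  # Tyr codon changed to a stop codon
    print(classify_csnp("GGT", 2, "C"))  # Gly codon changed to another Gly codon
```

The sketch treats a change between two stop codons as silent and a change from a stop
codon to an amino acid as missense; other conventions are possible for those cases.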

Other single nucleotide polymorphisms occur in noncoding regions. Some of
these polymorphisms may also result in defective protein expression (e.g., as a result of
defective splicing). Other single nucleotide polymorphisms have no phenotypic effects.
Single nucleotide polymorphisms can be used in the same manner as RFLPs and
VNTRs, but offer several advantages. Single nucleotide polymorphisms occur with
greater frequency and are spaced more uniformly throughout the genome than other
forms of polymorphism. The greater frequency and uniformity of single nucleotide
polymorphisms mean that there is a greater probability that such a polymorphism will
be found in close proximity to a genetic locus of interest than would be the case for
other polymorphisms. The different forms of characterized single nucleotide
polymorphisms are often easier to distinguish than other types of polymorphism (e.g.,
by use of assays employing allele-specific hybridization probes or primers).

Only a small percentage of the total repository of polymorphisms in humans and
other organisms has been identified. The limited number of polymorphisms identified
to date is due to the large amount of work required for their detection by conventional
methods. For example, a conventional approach to identifying polymorphisms might be
to sequence the same stretch of DNA in a population of individuals by dideoxy
sequencing. In this type of approach, the amount of work increases in proportion to
both the length of sequence and the number of individuals in a population and becomes
impractical for large stretches of DNA or large numbers of persons.



SUMMARY OF THE INVENTION 

Work described herein pertains to the identification of polymorphisms which can
predispose individuals to disease, by resequencing large numbers of genes in a large
number of individuals. Various genes from a number of individuals have been
resequenced as described herein, and SNPs in these genes have been discovered (see the
Table and Fig. 3). Some of these SNPs are cSNPs which specify a different amino acid
sequence, some of the SNPs are silent cSNPs and some of these cSNPs specify a stop
signal in protein translation. Some of the identified SNPs were located in non-coding
regions.

The invention relates to a gene which comprises a single nucleotide
polymorphism at a specific location. In a particular embodiment the invention relates to
the variant allele of a gene having a single nucleotide polymorphism, which variant
allele differs from a reference allele by one nucleotide at the site(s) identified in the
Table and Fig. 3. Complements of these nucleic acid sequences are also included. The
nucleic acid molecules can be DNA or RNA, and can be double- or single-stranded.
Nucleic acid molecules can be, for example, 5-10, 5-15, 10-20, 5-25, 10-30, 10-50 or
10-100 bases long.

The invention further provides allele-specific oligonucleotides that hybridize to 
the reference or variant allele of a gene comprising a single nucleotide polymorphism or 
to the complement thereof. These oligonucleotides can be probes or primers. 

The invention further provides a method of analyzing a nucleic acid from an
individual. The method determines which base is present at any one of the polymorphic
sites shown in the Table and/or Fig. 3. Optionally, a set of bases occupying a set of the
polymorphic sites shown in the Table and/or Fig. 3 is determined. This type of analysis
can be performed on a number of individuals, who are tested for the presence of a
disease phenotype. The presence or absence of disease phenotype is then correlated
with a base or set of bases present at the polymorphic site or sites in the individuals
tested.

Thus, the invention further relates to a method of predicting the presence,
absence, likelihood of the presence or absence, or severity of a particular phenotype or
disorder associated with a particular genotype. The method comprises obtaining a
nucleic acid sample from an individual and determining the identity of one or more
bases (nucleotides) at polymorphic sites of genes described herein, wherein the presence
of a particular base is correlated with a specified phenotype or disorder, thereby
predicting the presence, absence, likelihood of the presence or absence, or severity of
the phenotype or disorder in the individual.

The thrombospondins are a family of extracellular matrix (ECM) glycoproteins
that modulate many cell behaviors including adhesion, migration, and proliferation.
Thrombospondins (also known as thrombin sensitive proteins or TSPs) are large
molecular weight glycoproteins composed of three identical disulfide-linked
polypeptide chains. The results described herein also reveal an important association
between alterations, particularly SNPs, in TSP genes, particularly TSP-1 and TSP-4,
and vascular disease. In particular, SNPs in these genes which are associated with
premature coronary artery disease (CAD) (or coronary heart disease) and myocardial
infarction (MI) have been identified and represent a potentially vital marker of upstream
biology influencing the complex process of atherosclerotic plaque generation and
vulnerability.

Thus, the invention relates to the TSP gene SNPs identified as described herein,
both singly and in combination, as well as to the use of these SNPs, and others in TSP
genes, particularly those nearby in linkage disequilibrium with these SNPs, for
diagnosis, prediction of clinical course and treatment response for vascular disease,
development of new treatments for vascular disease based upon comparison of the
variant and normal versions of the gene or gene product, and development of
cell-culture based and animal models for research and treatment of vascular disease. The
invention further relates to novel compounds and pharmaceutical compositions for use
in the diagnosis and treatment of such disorders. In preferred embodiments, the
vascular disease is CAD or MI.

The invention relates to isolated nucleic acid molecules comprising all or a
portion of the variant allele of TSP-1 (e.g., as exemplified by SEQ ID NO: 1), and to
isolated nucleic acid molecules comprising all or a portion of the variant allele of TSP-4
(e.g., as exemplified by SEQ ID NO: 3). Preferred portions are at least 10 contiguous
nucleotides and comprise the polymorphic site, e.g., a portion of SEQ ID NO: 1 which
is at least 10 contiguous nucleotides and comprises the "G" at position 2210, or a
portion of SEQ ID NO: 3 which is at least 10 contiguous nucleotides and comprises the
"C" at position 1186. The invention further relates to isolated gene products, e.g.,
polypeptides or proteins, which are encoded by a nucleic acid molecule comprising all
or a portion of the variant allele of TSP-1 or TSP-4 (e.g., SEQ ID NO: 1 or SEQ ID NO:
3, respectively). The invention also relates to nucleic acid molecules which hybridize to
and/or share identity with the variant alleles identified herein (or their complements)
and which also comprise the variant nucleotide at the SNP site.

The invention further relates to isolated proteins or polypeptides comprising all or
a portion of the variant amino acid sequence of TSP-1 (e.g., as exemplified by SEQ ID
NO: 2), and to isolated proteins or polypeptides comprising all or a portion of the
variant amino acid sequence of TSP-4 (e.g., as exemplified by SEQ ID NO: 4).
Preferred polypeptides are at least 10 contiguous amino acids and comprise the
polymorphic amino acid, e.g., a portion of SEQ ID NO: 2 which is at least 10
contiguous amino acids and comprises the serine at residue 700, or a portion of SEQ ID
NO: 4 which is at least 10 contiguous amino acids and comprises the proline at residue
387. The invention further relates to isolated nucleic acid molecules encoding such
proteins and polypeptides, as well as to antibodies which bind, e.g., specifically, to such
proteins and polypeptides.

The invention further relates to a method of diagnosing or aiding in the diagnosis
of a disorder associated with the presence of one or more of (a) a G at nucleotide
position 2210 of SEQ ID NO: 1; or (b) a C at nucleotide position 1186 of SEQ ID NO:
3 in an individual. The method comprises obtaining a nucleic acid sample from the
individual and determining the nucleotide present at one or more of the indicated
nucleotide positions, wherein presence of one or more of (a) a G at nucleotide position
2210 of SEQ ID NO: 1; or (b) a C at nucleotide position 1186 of SEQ ID NO: 3 is
indicative of increased likelihood of said disorder in the individual as compared with an
appropriate control, e.g., an individual having the reference nucleotide at one or more of
said positions. In a particular embodiment the disorder is a vascular disease selected
from the group consisting of atherosclerosis, coronary heart or artery disease, MI,
stroke, peripheral vascular diseases, venous thromboembolism and pulmonary
embolism. In a preferred embodiment, the vascular disease is selected from the group
consisting of CAD and MI.

The invention further relates to a method of diagnosing or aiding in the diagnosis
of a disorder associated with one or more of (a) a G at nucleotide position 2210 of SEQ
ID NO: 1; or (b) a C at nucleotide position 1186 of SEQ ID NO: 3 in an individual. The
method comprises obtaining a nucleic acid sample from the individual and determining
the nucleotide present at one or more of the indicated nucleotide positions, wherein
presence of one or more of (a) an A at nucleotide position 2210 of SEQ ID NO: 1; or (b)
a G at nucleotide position 1186 of SEQ ID NO: 3 is indicative of decreased likelihood
of said disorder in the individual as compared with an appropriate control, e.g., an
individual having the variant nucleotide at said position. In a particular embodiment the
disorder is a vascular disease selected from the group consisting of atherosclerosis,
coronary heart or artery disease, MI, stroke, peripheral vascular diseases, venous
thromboembolism and pulmonary embolism. In a preferred embodiment, the vascular
disease is selected from the group consisting of CAD and MI.

In one embodiment, the invention relates to a method for predicting the likelihood
that an individual will have a vascular disease (or aiding in the diagnosis of a vascular
disease), comprising the steps of obtaining a DNA sample from an individual to be
assessed and determining the nucleotide present at one or more of nucleotide positions
2210 of SEQ ID NO: 1 or 1186 of SEQ ID NO: 3. The presence of the reference
nucleotide at one or more of these positions indicates that the individual has a lower
likelihood of having a vascular disease than an individual having the variant nucleotide
at one or more of these positions, or a lower likelihood of having severe symptomology.
In a particular embodiment, the individual is an individual at risk for development of a
vascular disease.

The invention further relates to a method of diagnosing or aiding in the diagnosis
of a disorder associated with the presence of one or more of (a) a serine at amino acid
position 700 of SEQ ID NO: 2; or (b) a proline at amino acid position 387 of SEQ ID
NO: 4 in an individual. The method comprises obtaining a biological sample containing
the TSP-1 and/or TSP-4 protein or relevant portion thereof from the individual and
determining the amino acid present at one or more of the indicated amino acid positions,
wherein presence of one or more of (a) a serine at amino acid position 700 of SEQ ID
NO: 2; or (b) a proline at amino acid position 387 of SEQ ID NO: 4 is indicative of
increased likelihood of said disorder in the individual as compared with an appropriate
control, e.g., an individual having the reference amino acid at one or more of said
positions.

The invention further relates to a method of diagnosing or aiding in the diagnosis
of a disorder associated with one or more of (a) a serine at amino acid position 700 of
SEQ ID NO: 2; or (b) a proline at amino acid position 387 of SEQ ID NO: 4 in an
individual. The method comprises obtaining a biological sample containing the TSP-1
and/or TSP-4 protein or relevant portion thereof from the individual and determining the
amino acid present at one or more of the indicated amino acid positions, wherein
presence of one or more of (a) an asparagine at amino acid position 700 of SEQ ID NO:
2; or (b) an alanine at amino acid position 387 of SEQ ID NO: 4 is indicative of
decreased likelihood of said disorder in the individual as compared with an appropriate
control, e.g., an individual having the variant amino acid at one or more of said
positions.

In one embodiment, the invention relates to a method for predicting the likelihood
that an individual will have a vascular disease (or aiding in the diagnosis of a vascular
disease), comprising the steps of obtaining a biological sample comprising the TSP-1
and/or TSP-4 protein or relevant portion thereof from an individual to be assessed and
determining the amino acid present at one or more of amino acid positions 700 of SEQ
ID NO: 2 or 387 of SEQ ID NO: 4. The presence of the reference amino acid at one or
more of these positions indicates that the individual has a lower likelihood of having a
vascular disease than an individual having the variant amino acid at one or more of
these positions, or a lower likelihood of having severe symptomology. In a particular
embodiment, the individual is an individual at risk for development of a vascular
disease.

In another embodiment, the invention relates to pharmaceutical compositions
comprising a reference TSP-1 and/or TSP-4 gene or gene product, or active portion
thereof, for use in the treatment of vascular diseases. The invention further relates to the
use of agonists and antagonists of TSP-1 and TSP-4 activity for use in the treatment of
vascular diseases. In a particular embodiment the vascular disease is selected from the
group consisting of atherosclerosis, coronary heart or artery disease, MI, stroke,
peripheral vascular diseases, venous thromboembolism and pulmonary embolism. In a
preferred embodiment, the vascular disease is selected from the group consisting of
CAD and MI.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figs. 1A-1D show the reference nucleotide (SEQ ID NO: 1) and amino acid (SEQ
ID NO: 2) sequences for TSP-1.

Figs. 2A-2C show the reference nucleotide (SEQ ID NO: 3) and amino acid (SEQ
ID NO: 4) sequences for TSP-4.

Fig. 3 shows a table providing detailed information about the SNPs identified
herein. Column one shows the internal polymorphism identifier. Column two shows
the accession number for the reference sequence in the TIGR database
(http://www.tigr.org/tdb/hgi/searching/hgi__reports.html). Column three shows the
nucleotide position for the SNP site. Column four shows the gene in which the
polymorphism was identified. Column five shows the polymorphic site and additional
flanking sequence on each side of the polymorphism. Column six shows the type of
mutation produced by the polymorphism. Columns seven and eight show the reference
and alternate (variant) nucleotides, respectively, for the SNP. Columns nine and ten
show the reference and alternate (variant) amino acids, respectively, encoded by the
alleles of the gene.

DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to a gene which comprises a single nucleotide
polymorphism (SNP) at a specific location. The gene which includes the SNP has at
least two alleles, referred to herein as the reference allele and the variant allele. The
reference allele (prototypical or wild type allele) has been designated arbitrarily and
typically corresponds to the nucleotide sequence of the gene which has been deposited
with GenBank or TIGR under a given Accession number. The variant allele differs
from the reference allele by one nucleotide at the site(s) identified in the Table. The
present invention also relates to variant alleles of the described genes and to
complements of the variant alleles. The invention also relates to nucleic acid molecules
which hybridize to and/or share identity with the variant alleles identified herein (or
their complements) and which also comprise the variant nucleotide at the SNP site.

The invention further relates to portions of the variant alleles and portions of
complements of the variant alleles which comprise (encompass) the site of the SNP and
are at least 5 nucleotides in length. Portions can be, for example, 5-10, 5-15, 10-20,
5-25, 10-30, 10-50 or 10-100 bases long. For example, a portion of a variant allele which
is 21 nucleotides in length includes the single nucleotide polymorphism (the nucleotide
which differs from the reference allele at that site) and twenty additional nucleotides
which flank the site in the variant allele. These nucleotides can be on one or both sides
of the polymorphism. Polymorphisms which are the subject of this invention are
defined in the Table with respect to the reference sequence deposited in GenBank or
TIGR under the Accession number indicated. For example, the invention relates to a
portion of a gene (e.g., AT3) having a nucleotide sequence as deposited in GenBank
(e.g., U11270) comprising a single nucleotide polymorphism at a specific position (e.g.,
nucleotide 11918). The reference nucleotide for AT3 is shown in column 8, and the
variant nucleotide is shown in column 9 of the Table. The nucleotide sequences of the
invention can be double- or single-stranded.
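
As an illustration of the 21-nucleotide example above, the following minimal sketch
builds a portion of a variant allele from a reference sequence. The function name
variant_portion, the 1-based position convention and the toy sequence are assumptions
made for illustration only; the sketch places the polymorphic site in the center, which
is just one of the arrangements contemplated, since the flanking nucleotides can be on
one or both sides.

```python
def variant_portion(reference, snp_pos, variant_base, flank=10):
    """Return a segment of the variant allele that includes the SNP site.

    reference: reference allele sequence (a toy string in the example below).
    snp_pos: 1-based position of the polymorphic site in the reference.
    variant_base: nucleotide present at that site in the variant allele.
    flank: number of flanking nucleotides requested on each side.
    """
    idx = snp_pos - 1
    if not 0 <= idx < len(reference):
        raise ValueError("SNP position lies outside the reference sequence")
    # Substitute the variant nucleotide at the polymorphic site.
    variant = reference[:idx] + variant_base + reference[idx + 1:]
    # Clip the window at the sequence ends so the slice is always valid.
    start = max(0, idx - flank)
    end = min(len(variant), idx + flank + 1)
    return variant[start:end]


if __name__ == "__main__":
    toy_reference = "ACGT" * 10          # hypothetical 40-nucleotide reference
    portion = variant_portion(toy_reference, snp_pos=21, variant_base="G")
    print(portion, len(portion))
```

With flank=10 and a site far enough from either end, the returned portion is 21
nucleotides long: the polymorphic nucleotide plus twenty flanking nucleotides.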

The invention further provides allele-specific oligonucleotides that hybridize to
the reference or variant allele of a gene comprising a single nucleotide polymorphism or
to the complement thereof. These oligonucleotides can be probes or primers.

The invention further provides a method of analyzing a nucleic acid from an
individual. The method determines which base is present at any one of the polymorphic
sites shown in the Table and/or Fig. 3. Optionally, a set of bases occupying a set of the
polymorphic sites shown in the Table and/or Fig. 3 is determined. This type of analysis
can be performed on a number of individuals, who are tested for the presence of a
disease phenotype. The presence or absence of disease phenotype is then correlated
with a base or set of bases present at the polymorphic site or sites in the individuals
tested.
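
The correlation step described above can be sketched, under stated assumptions, as a
2x2 tabulation of base against disease phenotype. The odds ratio used here is one
common summary statistic chosen for illustration; it is not stated in this description
to be the measure used. The function name, record layout and counts below are
hypothetical.

```python
def odds_ratio(records, variant_base):
    """Compute the odds ratio of disease for carriers of variant_base.

    records: iterable of (base_at_site, has_disease) pairs, one per individual.
    variant_base: the base whose association with the phenotype is examined.
    """
    carrier_ill = carrier_well = other_ill = other_well = 0
    for base, has_disease in records:
        if base == variant_base:
            if has_disease:
                carrier_ill += 1
            else:
                carrier_well += 1
        elif has_disease:
            other_ill += 1
        else:
            other_well += 1
    if carrier_well == 0 or other_ill == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is empty")
    return (carrier_ill * other_well) / (carrier_well * other_ill)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    records = (
        [("G", True)] * 30 + [("G", False)] * 10
        + [("A", True)] * 20 + [("A", False)] * 40
    )
    print(odds_ratio(records, "G"))
```

An odds ratio above 1 in such a tabulation would suggest that the base is found more
often among individuals with the phenotype; a formal analysis would also need a test of
statistical significance, which this sketch omits.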

Thus, the invention further relates to a method of predicting the presence,
absence, likelihood of the presence or absence, or severity of a particular phenotype or
disorder associated with a particular genotype. The method comprises obtaining a
nucleic acid sample from an individual and determining the identity of one or more
bases (nucleotides) at polymorphic sites of genes described herein, wherein the presence
of a particular base is correlated with a specified phenotype or disorder, thereby
predicting the presence, absence, likelihood of the presence or absence, or severity of
the phenotype or disorder in the individual.



DEFINITIONS

A nucleic acid molecule or oligonucleotide can be DNA or RNA, and single- or
double-stranded. Nucleic acid molecules and oligonucleotides can be naturally
occurring or synthetic, but are typically prepared by synthetic means. Preferred nucleic
acid molecules and oligonucleotides of the invention include segments of DNA, or their
complements, which include any one of the polymorphic sites shown in the Table. The
segments can be between 5 and 250 bases, and, in specific embodiments, are between
5-10, 5-20, 10-20, 10-50, 20-50 or 10-100 bases. For example, the segment can be 21
bases. The polymorphic site can occur within any position of the segment. The
segments can be from any of the allelic forms of DNA shown in the Table.

As used herein, the terms "nucleotide", "base" and "nucleic acid" are intended to
be equivalent. The terms "nucleotide sequence", "nucleic acid sequence", "nucleic acid
molecule" and "segment" are intended to be equivalent.

Hybridization probes are oligonucleotides which bind in a base-specific manner to
a complementary strand of nucleic acid. Such probes include peptide nucleic acids, as
described in Nielsen et al., Science 254, 1497-1500 (1991). Probes can be any length
suitable for specific hybridization to the target nucleic acid sequence. The most
appropriate length of the probe may vary depending upon the hybridization method in
which it is being used; for example, particular lengths may be more appropriate for use
in microfabricated arrays, while other lengths may be more suitable for use in classical
hybridization methods. Such optimizations are known to the skilled artisan. Suitable
probes and primers can range from about 5 nucleotides to about 30 nucleotides in
length. For example, probes and primers can be 5, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24,
25, 26, 28 or 30 nucleotides in length. The probe or primer preferably overlaps at least
one polymorphic site occupied by any of the possible variant nucleotides. The
nucleotide sequence can correspond to the coding sequence of the allele or to the
complement of the coding sequence of the allele.

As used herein, the term "primer" refers to a single-stranded oligonucleotide
which acts as a point of initiation of template-directed DNA synthesis under appropriate
conditions (e.g., in the presence of four different nucleoside triphosphates and an agent
for polymerization, such as DNA or RNA polymerase or reverse transcriptase) in an
appropriate buffer and at a suitable temperature. The appropriate length of a primer
depends on the intended use of the primer, but typically ranges from 15 to 30
nucleotides. Short primer molecules generally require cooler temperatures to form
sufficiently stable hybrid complexes with the template. A primer need not reflect the
exact sequence of the template, but must be sufficiently complementary to hybridize
with a template. The term primer site refers to the area of the target DNA to which a
primer hybridizes. The term primer pair refers to a set of primers including a 5'
(upstream) primer that hybridizes with the 5' end of the DNA sequence to be amplified
and a 3' (downstream) primer that hybridizes with the complement of the 3' end of the
sequence to be amplified.

As used herein, linkage describes the tendency of genes, alleles, loci or genetic
markers to be inherited together as a result of their location on the same chromosome.
It can be measured by percent recombination between the two genes, alleles, loci or
genetic markers.

As used herein, polymorphism refers to the occurrence of two or more genetically
determined alternative sequences or alleles in a population. A polymorphic marker or
site is the locus at which divergence occurs. Preferred markers have at least two alleles,
each occurring at a frequency of greater than 1%, and more preferably greater than 10%
or 20% of a selected population. A polymorphic locus may be as small as one base pair.
Polymorphic markers include restriction fragment length polymorphisms, variable
number of tandem repeats (VNTRs), hypervariable regions, minisatellites, dinucleotide
repeats, trinucleotide repeats, tetranucleotide repeats, simple sequence repeats, and
insertion elements such as Alu. The first identified allelic form is arbitrarily designated
as the reference form and other allelic forms are designated as alternative or variant
alleles. The allelic form occurring most frequently in a selected population is
sometimes referred to as the wildtype form. Diploid organisms may be homozygous or
heterozygous for allelic forms. A diallelic or biallelic polymorphism has two forms. A
triallelic polymorphism has three forms.
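
The frequency criterion for preferred markers (at least two alleles, each at a frequency
above a chosen threshold) can be checked with a short sketch. It assumes diploid
genotypes recorded as pairs of alleles; the function names and the genotype counts are
hypothetical.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return the frequency of each allele among diploid genotypes.

    genotypes: iterable of (allele_1, allele_2) pairs, one pair per individual.
    """
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def meets_marker_criterion(genotypes, min_freq=0.01):
    """True if at least two alleles each occur at a frequency above min_freq."""
    freqs = allele_frequencies(genotypes)
    return sum(1 for f in freqs.values() if f > min_freq) >= 2


if __name__ == "__main__":
    # Hypothetical sample: 90 homozygotes and 10 heterozygotes.
    sample = [("A", "A")] * 90 + [("A", "G")] * 10
    print(allele_frequencies(sample))
    print(meets_marker_criterion(sample, min_freq=0.01))
    print(meets_marker_criterion(sample, min_freq=0.10))
```

In this hypothetical sample the minor allele accounts for 10 of 200 chromosomes, so the
site satisfies the 1% criterion but not the more stringent 10% criterion.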

Work described herein pertains to the resequencing of large numbers of genes in a
large number of individuals to identify polymorphisms which can predispose
individuals to disease. For example, polymorphisms in genes which are expressed in
liver may predispose individuals to disorders of the liver. By altering amino acid
sequence, SNPs may alter the function of the encoded proteins. The discovery of the
SNP facilitates biochemical analysis of the variants and the development of assays to
characterize the variants and to screen for pharmaceuticals that would interact directly
with one or another form of the protein. SNPs (including silent SNPs) also enable the
development of specific DNA, RNA, or protein-based diagnostics that detect the
presence or absence of the polymorphism in particular conditions.

A single nucleotide polymorphism occurs at a polymorphic site occupied by a
single nucleotide, which is the site of variation between allelic sequences. The site is
usually preceded by and followed by highly conserved sequences of the allele (e.g.,
sequences that vary in less than 1/100 or 1/1000 members of the populations).

A single nucleotide polymorphism usually arises due to substitution of one
nucleotide for another at the polymorphic site. A transition is the replacement of one
purine by another purine or one pyrimidine by another pyrimidine. A transversion is the
replacement of a purine by a pyrimidine or vice versa. Single nucleotide
polymorphisms can also arise from a deletion of a nucleotide or an insertion of a
nucleotide relative to a reference allele. Typically the polymorphic site is occupied by a
base other than the reference base. For example, where the reference allele contains the
base "T" at the polymorphic site, the altered allele can contain a "C", "G" or "A" at the
polymorphic site.
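
The transition and transversion definitions above map directly onto a small
classification routine. The sketch below is illustrative only; the function name is
hypothetical and it handles DNA bases A, C, G and T.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref_base, alt_base):
    """Classify a single-nucleotide substitution as a transition or transversion."""
    ref_base, alt_base = ref_base.upper(), alt_base.upper()
    for base in (ref_base, alt_base):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"not a DNA base: {base!r}")
    if ref_base == alt_base:
        raise ValueError("reference and altered bases are identical")
    same_class = ({ref_base, alt_base} <= PURINES
                  or {ref_base, alt_base} <= PYRIMIDINES)
    # Purine-to-purine or pyrimidine-to-pyrimidine changes are transitions;
    # any change between the two classes is a transversion.
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for alt in ("C", "G", "A"):
        print("T ->", alt, substitution_type("T", alt))
```

For the reference base "T" used in the example above, a "C" at the polymorphic site is a
transition, while a "G" or an "A" is a transversion.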

The invention also relates to nucleic acid molecules which hybridize to the variant
alleles identified herein (or their complements) and which also comprise the variant
nucleotide at the SNP site. Hybridizations are usually performed under stringent
conditions, for example, at a salt concentration of no more than 1 M and a temperature
of at least 25°C. For example, conditions of 5X SSPE (750 mM NaCl, 50 mM
NaPhosphate, 5 mM EDTA, pH 7.4) and a temperature of 25-30°C, or equivalent
conditions, are suitable for allele-specific probe hybridizations. Equivalent conditions
can be determined by varying one or more of the parameters given as an example, as
known in the art, while maintaining a similar degree of identity or similarity between
the target nucleotide sequence and the primer or probe used.

The invention also relates to nucleic acid molecules which share substantial 
sequence identity to the variant alleles identified herein (or their complements) and 
which also comprise the variant nucleotide at the SNP site. Particularly preferred are 
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nucleic acid molecules and fragments which have at least about 60%, preferably at least 
about 70, 80 or 85%, more preferably at least about 90%, even more preferably at least 
about 95%, and most preferably at least about 98% identity with nucleic acid molecules 
described herein. The percent identity of two nucleotide or amino acid sequences can 
5 be determined by aligning the sequences for optimal comparison purposes (e.g., gaps 
can be introduced in the sequence of a first sequence). The nucleotides or amino acids 
at corresponding positions are then compared, and the percent identity between the two 
sequences is a function of the number of identical positions shared by the sequences 
(i.e., % identity = (# of identical positions / total # of positions) x 100). In certain
embodiments, the length of a sequence aligned for comparison purposes is at least 30%,
preferably at least 40%, more preferably at least 60%, and even more preferably at least 
70%, 80% or 90% of the length of the reference sequence. The actual comparison of the 
two sequences can be accomplished by well-known methods, for example, using a 
mathematical algorithm. A preferred, non-limiting example of such a mathematical
algorithm is described in Karlin et al., Proc. Natl. Acad. Sci. USA, 90:5873-5877
(1993). Such an algorithm is incorporated into the NBLAST and XBLAST programs
(version 2.0) as described in Altschul et al., Nucleic Acids Res., 25:3389-3402 (1997).
When utilizing BLAST and Gapped BLAST programs, the default parameters of the
respective programs (e.g., NBLAST) can be used. See http://www.ncbi.nlm.nih.gov. In
one embodiment, parameters for sequence comparison can be set at score=100,
wordlength=12, or can be varied (e.g., W=5 or W=20).
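
By way of illustration only, the percent identity formula given above can be sketched in Python. The function name percent_identity and the use of "-" as the gap character are assumptions made for this sketch. It scores two sequences that have already been aligned (for example by a BLAST program) and does not itself perform the alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Implements: % identity = (# identical positions / total # positions) x 100.
    A gap is never counted as an identical position.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    # Six aligned positions, four of them identical.
    print(round(percent_identity("ACGT-A", "ACCTTA"), 2))
```

Whether gapped positions count toward the total number of positions is a design choice; this sketch counts every alignment column, which gives the more conservative figure.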

The term "isolated" is used herein to indicate that the material in question exists in 
a physical milieu distinct from that in which it occurs in nature. For example, an 
isolated nucleic acid of the invention may be substantially isolated with respect to the
complex cellular milieu in which it naturally occurs. In some instances, the isolated
material will form part of a composition (for example, a crude extract containing other
substances), buffer system or reagent mix. In other circumstances, the material may be
purified to essential homogeneity, for example as determined by PAGE or column
chromatography such as HPLC. Preferably, an isolated nucleic acid comprises at least
about 50, 80 or 90 percent (on a molar basis) of all macromolecular species present.

I. Novel Polymorphisms of the Invention 

Some of the novel polymorphisms of the invention are shown in the Table.
Columns one and two show designations for the indicated polymorphism. Column
three shows the Genbank or TIGR Accession number for the wild type (or reference)
allele. Column four shows the location of the polymorphic site in the nucleic acid
sequence with reference to the Genbank or TIGR sequence shown in column three.
Column five shows common names for the gene in which the polymorphism is located.
Column six shows the polymorphism and a portion of the 3' and 5' flanking sequence of
the gene. Column seven shows the type of mutation: N, nonsense; S, silent; M,
missense. Columns eight and nine show the reference and alternate nucleotides,
respectively, at the polymorphic site. Columns ten and eleven show the reference and
alternate amino acids encoded by the reference and variant alleles,
respectively. Other novel polymorphisms of the invention are shown in Fig. 3.
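
As a non-limiting illustration of how the mutation type in column seven relates to the nucleotides and amino acids in columns eight to eleven, the following Python sketch classifies a single-base change within a codon as nonsense (N), silent (S) or missense (M). It assumes the standard genetic code; the function name classify_snp and its arguments are hypothetical and are not taken from the Table.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon: str, position: int, alt_base: str):
    """Return (type, reference amino acid, alternate amino acid).

    position is the 0-based index of the polymorphic base within the codon.
    type is 'S' (silent), 'N' (nonsense) or 'M' (missense); '*' denotes stop.
    """
    ref_codon = ref_codon.upper()
    if position not in (0, 1, 2):
        raise ValueError("position must be 0, 1 or 2")
    alt_codon = ref_codon[:position] + alt_base.upper() + ref_codon[position + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if alt_aa == ref_aa:
        kind = "S"
    elif alt_aa == "*":
        kind = "N"
    else:
        kind = "M"
    return kind, ref_aa, alt_aa


if __name__ == "__main__":
    print(classify_snp("GAG", 1, "T"))  # Glu codon changed at its middle base
```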

II. Analysis of Polymorphisms 

A. Preparation of Samples 

Polymorphisms are detected in a target nucleic acid from an individual being
analyzed. For assay of genomic DNA, virtually any biological sample (other than pure
red blood cells) is suitable. For example, convenient tissue samples include whole
blood, semen, saliva, tears, urine, fecal material, sweat, buccal cells, skin and hair. For assay
of cDNA or mRNA, the tissue sample must be obtained from an organ in which the
target nucleic acid is expressed. For example, if the target nucleic acid is a cytochrome
P450, the liver is a suitable source.

Many of the methods described below require amplification of DNA from target
samples. This can be accomplished by, e.g., PCR. See generally PCR Technology:
Principles and Applications for DNA Amplification (ed. H. A. Erlich, Freeman Press,
NY, NY, 1992); PCR Protocols: A Guide to Methods and Applications (eds. Innis et
al., Academic Press, San Diego, CA, 1990); Mattila et al., Nucleic Acids Res. 19, 4967
(1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (eds.
McPherson et al., IRL Press, Oxford); and U.S. Patent 4,683,202.

Other suitable amplification methods include the ligase chain reaction (LCR) (see
Wu and Wallace, Genomics 4, 560 (1989); Landegren et al., Science 241, 1077 (1988)),
transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86, 1173 (1989)),
self-sustained sequence replication (Guatelli et al., Proc. Natl. Acad. Sci. USA 87,
1874 (1990)) and nucleic acid based sequence amplification (NASBA). The latter two
amplification methods involve isothermal reactions based on isothermal transcription,
which produce both single stranded RNA (ssRNA) and double stranded DNA (dsDNA)
as the amplification products in a ratio of about 30 or 100 to 1, respectively.

B. Detection of Polymorphisms in Target DNA 

There are two distinct types of analysis of target DNA for detecting
polymorphisms. The first type of analysis, sometimes referred to as de novo
characterization, is carried out to identify polymorphic sites not previously characterized
(i.e., to identify new polymorphisms). This analysis compares target sequences in
different individuals to identify points of variation, i.e., polymorphic sites. By
analyzing groups of individuals representing the greatest ethnic diversity among humans
and greatest breed and species variety in plants and animals, patterns characteristic of
the most common alleles/haplotypes of the locus can be identified, and the frequencies
of such alleles/haplotypes in the population can be determined. Additional allelic
frequencies can be determined for subpopulations characterized by criteria such as
geography, race, or gender. The de novo identification of polymorphisms of the
invention is described in the Examples section. The second type of analysis determines
which form(s) of a characterized (known) polymorphism are present in individuals
under test. There are a variety of suitable procedures, which are discussed in turn.

1. Allele-Specific Probes

The design and use of allele-specific probes for analyzing polymorphisms is
described by, e.g., Saiki et al., Nature 324, 163-166 (1986); Dattagupta, EP 235,726;
Saiki, WO 89/11548. Allele-specific probes can be designed that hybridize to a
segment of target DNA from one individual but do not hybridize to the corresponding
segment from another individual due to the presence of different polymorphic forms in
the respective segments from the two individuals. Hybridization conditions should be
sufficiently stringent that there is a significant difference in hybridization intensity
between alleles, and preferably an essentially binary response, whereby a probe
hybridizes to only one of the alleles. Some probes are designed to hybridize to a
segment of target DNA such that the polymorphic site aligns with a central position
(e.g., in a 15-mer at the 7 position; in a 16-mer, at either the 8 or 9 position) of the
probe. This design of probe achieves good discrimination in hybridization between
different allelic forms.

Allele-specific probes are often used in pairs, one member of a pair showing a
perfect match to a reference form of a target sequence and the other member showing a
perfect match to a variant form. Several pairs of probes can then be immobilized on the
same support for simultaneous analysis of multiple polymorphisms within the same
target sequence.
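
The paired-probe design described above can be sketched as follows, purely by way of example. The function name allele_specific_probes, the 0-based snp_index argument and the choice of length // 2 as the central offset are assumptions of this sketch. The returned strings are the target-strand segments; an actual probe would be the complement of the strand to which it hybridizes.

```python
def allele_specific_probes(reference: str, snp_index: int,
                           variant_base: str, length: int = 15):
    """Return (reference_probe, variant_probe) centered on the polymorphic site.

    reference    -- reference allele sequence (target strand)
    snp_index    -- 0-based index of the polymorphic site in `reference`
    variant_base -- base present at that site in the variant allele
    length       -- probe length; the site is placed at offset length // 2
    """
    reference = reference.upper()
    variant_base = variant_base.upper()
    if reference[snp_index] == variant_base:
        raise ValueError("variant base is identical to the reference base")
    offset = length // 2          # index 7 in a 15-mer, index 8 in a 16-mer
    start = snp_index - offset
    end = start + length
    if start < 0 or end > len(reference):
        raise ValueError("not enough flanking sequence for this probe length")
    ref_probe = reference[start:end]
    var_probe = ref_probe[:offset] + variant_base + ref_probe[offset + 1:]
    return ref_probe, var_probe


if __name__ == "__main__":
    ref, var = allele_specific_probes("GGAAAAAAATCCCCCCCGG", 9, "G")
    print(ref)
    print(var)
```

The two probes differ at a single central base, which is the property relied on above for discrimination between allelic forms.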

2. Tiling Arrays

The polymorphisms can also be identified by hybridization to nucleic acid arrays,
some examples of which are described in WO 95/11995. One form of such arrays is
described in the Examples section in connection with de novo identification of
polymorphisms. The same array or a different array can be used for analysis of
characterized polymorphisms. WO 95/11995 also describes subarrays that are
optimized for detection of a variant form of a precharacterized polymorphism. Such a
subarray contains probes designed to be complementary to a second reference sequence,
which is an allelic variant of the first reference sequence. The second group of probes is
designed by the same principles as described in the Examples, except that the probes
exhibit complementarity to the second reference sequence. The inclusion of a second
group (or further groups) can be particularly useful for analyzing short subsequences of
the primary reference sequence in which multiple mutations are expected to occur
within a short distance commensurate with the length of the probes (e.g., two or more
mutations within 9 to 21 bases).



3. Allele-Specific Primers 

An allele-specific primer hybridizes to a site on target DNA overlapping a
polymorphism and only primes amplification of an allelic form to which the primer
exhibits perfect complementarity. See Gibbs, Nucleic Acid Res. 17, 2427-2448 (1989).
This primer is used in conjunction with a second primer which hybridizes at a distal
site. Amplification proceeds from the two primers, resulting in a detectable product
which indicates the particular allelic form is present. A control is usually performed
with a second pair of primers, one of which shows a single base mismatch at the
polymorphic site and the other of which exhibits perfect complementarity to a distal
site. The single-base mismatch prevents amplification and no detectable product is
formed. The method works best when the mismatch is included in the 3'-most position
of the oligonucleotide aligned with the polymorphism because this position is most
destabilizing to elongation from the primer (see, e.g., WO 93/22456).
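
The placement of the polymorphic base at the 3'-most position of the primer can be illustrated with the short Python sketch below. The function name allele_specific_primer and its parameters are hypothetical. The sketch only extracts a forward primer sequence from a given strand and does not address melting temperature, the distal second primer, or other design constraints.

```python
def allele_specific_primer(template: str, snp_index: int,
                           allele_base: str, length: int = 20) -> str:
    """Forward primer whose 3'-terminal base lies on the polymorphic site.

    template    -- sequence of the strand whose 5'->3' sequence the primer copies
    snp_index   -- 0-based index of the polymorphic site in `template`
    allele_base -- allele the primer should match at its 3' end
    length      -- total primer length, including the 3'-terminal base
    """
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough 5' flanking sequence for this primer length")
    return template[start:snp_index].upper() + allele_base.upper()


if __name__ == "__main__":
    template = "ACGTACGTAC"           # polymorphic site at index 5 (reference 'C')
    print(allele_specific_primer(template, 5, "C", length=4))  # reference primer
    print(allele_specific_primer(template, 5, "T", length=4))  # variant primer
```

The reference and variant primers produced this way are identical except at the 3' terminus, so a mismatch with the template falls at the position described above as most destabilizing to elongation.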



4. Direct-Sequencing

The direct analysis of the sequence of polymorphisms of the present invention can
be accomplished using either the dideoxy chain termination method or the
Maxam-Gilbert method (see Sambrook et al., Molecular Cloning, A Laboratory Manual (2nd
Ed., CSHP, New York 1989); Zyskind et al., Recombinant DNA Laboratory Manual,
(Acad. Press, 1988)).

5. Denaturing Gradient Gel Electrophoresis

Amplification products generated using the polymerase chain reaction can be
analyzed by the use of denaturing gradient gel electrophoresis. Different alleles can be
identified based on the different sequence-dependent melting properties and
electrophoretic migration of DNA in solution. See Erlich, ed., PCR Technology, Principles
and Applications for DNA Amplification (W.H. Freeman and Co, New York, 1992),
Chapter 7.

6. Single-Strand Conformation Polymorphism Analysis 

Alleles of target sequences can be differentiated using single-strand conformation
polymorphism analysis, which identifies base differences by alteration in
electrophoretic migration of single stranded PCR products, as described in Orita et al.,
Proc. Natl. Acad. Sci. 86, 2766-2770 (1989). Amplified PCR products can be generated
as described above, and heated or otherwise denatured, to form single stranded
amplification products. Single-stranded nucleic acids may refold or form secondary
structures which are partially dependent on the base sequence. The different
electrophoretic mobilities of single-stranded amplification products can be related to
base-sequence differences between alleles of target sequences.

7. Single-Base Extension 

An alternative method for identifying and analyzing polymorphisms is based on
single-base extension (SBE) of a fluorescently-labeled primer coupled with fluorescence
resonance energy transfer (FRET) between the label of the added base and the label of
the primer. Typically, the method, such as that described by Chen et al. (PNAS
94:10756-61 (1997), incorporated herein by reference), uses a locus-specific
oligonucleotide primer labeled on the 5' terminus with 5-carboxyfluorescein (FAM).
This labeled primer is designed so that the 3' end is immediately adjacent to the
polymorphic site of interest. The labeled primer is hybridized to the locus, and single
base extension of the labeled primer is performed with fluorescently labeled
dideoxyribonucleotides (ddNTPs) in dye-terminator sequencing fashion, except that no
deoxyribonucleotides are present. An increase in fluorescence of the added ddNTP in
response to excitation at the wavelength of the labeled primer is used to infer the
identity of the added nucleotide.

III. Methods of Use

After determining polymorphic form(s) present in an individual at one or more 
polymorphic sites, this information can be used in a number of methods. 

A. Forensics 

Determination of which polymorphic forms occupy a set of polymorphic sites in
an individual identifies a set of polymorphic forms that distinguishes the individual.
See generally National Research Council, The Evaluation of Forensic DNA Evidence
(Eds. Pollard et al., National Academy Press, DC, 1996). The more sites that are
analyzed, the lower the probability that the set of polymorphic forms in one individual
is the same as that in an unrelated individual. Preferably, if multiple sites are analyzed,
the sites are unlinked. Thus, polymorphisms of the invention are often used in
conjunction with polymorphisms in distal genes. Preferred polymorphisms for use in
forensics are biallelic because the population frequencies of two polymorphic forms can
usually be determined with greater accuracy than those of multiple polymorphic forms
at multi-allelic loci.

The capacity to identify a distinguishing or unique set of forensic markers in an
individual is useful for forensic analysis. For example, one can determine whether a
blood sample from a suspect matches a blood or other tissue sample from a crime scene
by determining whether the set of polymorphic forms occupying selected polymorphic
sites is the same in the suspect and the sample. If the set of polymorphic markers does
not match between a suspect and a sample, it can be concluded (barring experimental
error) that the suspect was not the source of the sample. If the set of markers does
match, one can conclude that the DNA from the suspect is consistent with that found at
the crime scene. If frequencies of the polymorphic forms at the loci tested have been
determined (e.g., by analysis of a suitable population of individuals), one can perform a
statistical analysis to determine the probability that a match of suspect and crime scene
sample would occur by chance.

p(ID) is the probability that two random individuals have the same polymorphic

or allelic form at a given polymorphic site. In biallelic loci, four genotypes are possible:
AA, AB, BA, and BB. If alleles A and B occur in a haploid genome of the organism
with frequencies x and y, the probability of each genotype in a diploid organism is (see
WO 95/12607):

Homozygote: p(AA) = x^2

Homozygote: p(BB) = y^2 = (1-x)^2

Single Heterozygote: p(AB) = p(BA) = xy = x(1-x)

Both Heterozygotes: p(AB+BA) = 2xy = 2x(1-x)

The probability of identity at one locus (i.e., the probability that two individuals,
picked at random from a population, will have identical polymorphic forms at a given
locus) is given by the equation:

p(ID) = (x^2)^2 + (2xy)^2 + (y^2)^2.

These calculations can be extended for any number of polymorphic forms at a
given locus. For example, the probability of identity p(ID) for a 3-allele system where
the alleles have the frequencies in the population of x, y and z, respectively, is equal to
the sum of the squares of the genotype frequencies:

p(ID) = x^4 + (2xy)^2 + (2yz)^2 + (2xz)^2 + z^4 + y^4

In a locus of n alleles, the appropriate binomial expansion is used to calculate
p(ID) and p(exc).
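
The single-locus calculation can be sketched in Python for any number of alleles, as a worked illustration only. The sketch assumes the genotype frequencies given above (the square of the allele frequency for each homozygote, and twice the product of the two allele frequencies for each heterozygote) and sums the squares of those genotype frequencies. The function name probability_of_identity is an assumption of this sketch.

```python
from itertools import combinations


def probability_of_identity(allele_freqs):
    """p(ID) at one locus: sum of squared genotype frequencies.

    allele_freqs -- population frequencies of the alleles at the locus;
                    they must sum to 1.
    """
    if any(p < 0 for p in allele_freqs):
        raise ValueError("allele frequencies must be non-negative")
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygotes = [p * p for p in allele_freqs]
    heterozygotes = [2 * p * q for p, q in combinations(allele_freqs, 2)]
    return sum(g * g for g in homozygotes + heterozygotes)


if __name__ == "__main__":
    # Biallelic locus with x = y = 0.5: (x^2)^2 + (2xy)^2 + (y^2)^2
    print(probability_of_identity([0.5, 0.5]))
    # Three-allele locus
    print(probability_of_identity([0.5, 0.3, 0.2]))
```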

The cumulative probability of identity (cum p(ID)) for each of multiple unlinked
loci is determined by multiplying the probabilities provided by each locus.

cum p(ID) = p(ID1) p(ID2) p(ID3) ... p(IDn)

The cumulative probability of non-identity for n loci (i.e., the probability that two
random individuals will be different at 1 or more loci) is given by the equation:

cum p(nonID) = 1 - cum p(ID).
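
As a brief numerical illustration of the two cumulative formulas, the following Python sketch multiplies per-locus p(ID) values for unlinked loci. The function names and the example values are hypothetical and do not represent measured frequencies.

```python
from math import prod


def cumulative_identity(p_ids):
    """cum p(ID): product of the per-locus probabilities of identity."""
    return prod(p_ids)


def cumulative_non_identity(p_ids):
    """cum p(nonID) = 1 - cum p(ID)."""
    return 1.0 - cumulative_identity(p_ids)


if __name__ == "__main__":
    # Ten unlinked biallelic loci, each with p(ID) = 0.375 (x = y = 0.5)
    loci = [0.375] * 10
    print(cumulative_identity(loci))
    print(cumulative_non_identity(loci))
```

The multiplication is only valid for unlinked loci, which is why the sites analyzed are preferably unlinked.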

If several polymorphic loci are tested, the cumulative probability of non-identity
for random individuals becomes very high (e.g., one billion to one). Such probabilities
can be taken into account together with other evidence in determining the guilt or
innocence of the suspect.



B. Paternity Testing 

The object of paternity testing is usually to determine whether a male is the father
of a child. In most cases, the mother of the child is known and thus, the mother's
contribution to the child's genotype can be traced. Paternity testing investigates whether
the part of the child's genotype not attributable to the mother is consistent with that of
the putative father. Paternity testing can be performed by analyzing sets of
polymorphisms in the putative father and the child.

If the set of polymorphisms in the child attributable to the father does not match
the set of polymorphisms of the putative father, it can be concluded, barring
experimental error, that the putative father is not the real father. If the set of
polymorphisms in the child attributable to the father does match the set of
polymorphisms of the putative father, a statistical calculation can be performed to
determine the probability of coincidental match.

The probability of parentage exclusion (representing the probability that a random
male will have a polymorphic form at a given polymorphic site that makes him
incompatible as the father) is given by the equation (see WO 95/12607):

p(exc) = xy(1-xy)

where x and y are the population frequencies of alleles A and B of a biallelic
polymorphic site.

(At a triallelic site, p(exc) = xy(1-xy) + yz(1-yz) + xz(1-xz) + 3xyz(1-xyz), where
x, y and z are the respective population frequencies of alleles A, B and C.)

The probability of non-exclusion is

p(non-exc) = 1 - p(exc)

The cumulative probability of non-exclusion (representing the value obtained
when n loci are used) is thus:

cum p(non-exc) = p(non-exc1) p(non-exc2) p(non-exc3) ... p(non-excn)

The cumulative probability of exclusion for n loci (representing the probability
that a random male will be excluded) is

cum p(exc) = 1 - cum p(non-exc).
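
The exclusion formulas above can be sketched in Python as follows, by way of example only. The sketch implements the biallelic equation exactly as stated above, with y taken as 1 - x; the function names and the example frequencies are hypothetical.

```python
from math import prod


def p_exclusion_biallelic(x: float) -> float:
    """p(exc) = xy(1 - xy) at a biallelic site, with y = 1 - x."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    y = 1.0 - x
    return x * y * (1.0 - x * y)


def cumulative_exclusion(p_excs):
    """cum p(exc) = 1 - product over loci of (1 - p(exc))."""
    return 1.0 - prod(1.0 - p for p in p_excs)


if __name__ == "__main__":
    per_locus = p_exclusion_biallelic(0.5)
    print(per_locus)
    # Cumulative exclusion over 20 unlinked loci with the same frequencies
    print(cumulative_exclusion([per_locus] * 20))
```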

If several polymorphic loci are included in the analysis, the cumulative
probability of exclusion of a random male is very high. This probability can be taken
into account in assessing the liability of a putative father whose polymorphic marker set
matches the child's polymorphic marker set attributable to his/her father.

C. Correlation of Polymorphisms with Phenotypic Traits 

The polymorphisms of the invention may contribute to the phenotype of an
organism in different ways. Some polymorphisms occur within a protein coding
sequence and contribute to phenotype by affecting protein structure. The effect may be
neutral, beneficial or detrimental, or both beneficial and detrimental, depending on the
circumstances. For example, a heterozygous sickle cell mutation confers resistance to
malaria, but a homozygous sickle cell mutation is usually lethal. Other polymorphisms
occur in noncoding regions but may exert phenotypic effects indirectly via influence on
replication, transcription, and translation. A single polymorphism may affect more than
one phenotypic trait. Likewise, a single phenotypic trait may be affected by
polymorphisms in different genes. Further, some polymorphisms predispose an
individual to a distinct mutation that is causally related to a certain phenotype.

Phenotypic traits include diseases that have known but hitherto unmapped genetic
components (e.g., agammaglobulinemia, diabetes insipidus, Lesch-Nyhan syndrome,
muscular dystrophy, Wiskott-Aldrich syndrome, Fabry's disease, familial
hypercholesterolemia, polycystic kidney disease, hereditary spherocytosis, von
Willebrand's disease, tuberous sclerosis, hereditary hemorrhagic telangiectasia, familial
colonic polyposis, Ehlers-Danlos syndrome, osteogenesis imperfecta, and acute
intermittent porphyria). Phenotypic traits also include symptoms of, or susceptibility to,
multifactorial diseases of which a component is or may be genetic, such as autoimmune
diseases, inflammation, cancer, diseases of the nervous system, and infection by
pathogenic microorganisms. Some examples of autoimmune diseases include
rheumatoid arthritis, multiple sclerosis, diabetes (insulin-dependent and
non-insulin-dependent), systemic lupus erythematosus and Graves' disease. Some examples of
cancers include cancers of the bladder, brain, breast, colon, esophagus, kidney,
leukemia, liver, lung, oral cavity, ovary, pancreas, prostate, skin, stomach and uterus.
Phenotypic traits also include characteristics such as longevity, appearance (e.g.,
baldness, obesity), strength, speed, endurance, fertility, and susceptibility or receptivity
to particular drugs or therapeutic treatments.



The correlation of one or more polymorphisms with phenotypic traits can be
facilitated by knowledge of the gene product of the wild type (reference) gene. The
genes in which cSNPs of the present invention have been identified are genes which
have been previously sequenced and characterized in one of their allelic forms.

Correlation is performed for a population of individuals who have been tested for
the presence or absence of a phenotypic trait of interest and for polymorphic marker
sets. To perform such analysis, the presence or absence of a set of polymorphisms (i.e.,
a polymorphic set) is determined for a set of the individuals, some of whom exhibit a
particular trait, and some of whom exhibit lack of the trait. The alleles of each
polymorphism of the set are then reviewed to determine whether the presence or
absence of a particular allele is associated with the trait of interest. Correlation can be
performed by standard statistical methods such as a chi-squared test, and statistically
significant correlations between polymorphic form(s) and phenotypic characteristics are
noted. For example, it might be found that the presence of allele A1 at polymorphism A
correlates with heart disease. As a further example, it might be found that the combined
presence of allele A1 at polymorphism A and allele B1 at polymorphism B correlates
with increased milk production of a farm animal.
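
One possible form of such a statistical test is sketched below in Python, as an illustration under stated assumptions rather than a prescribed method. The sketch computes a Pearson chi-squared statistic (without continuity correction) for a 2 x 2 table of allele presence against trait presence, and obtains the p-value for one degree of freedom from the complementary error function. The function name and the example counts are hypothetical.

```python
from math import erfc, sqrt


def chi_squared_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared test on a 2 x 2 contingency table.

                      trait present   trait absent
    allele present          a               b
    allele absent           c               d

    Returns (chi-squared statistic, p-value for 1 degree of freedom).
    """
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must contain at least one count")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts: 30 of 40 carriers show the trait, 10 of 40 non-carriers do
    chi2, p = chi_squared_2x2(30, 10, 10, 30)
    print(chi2, p)
```

A statistically significant result indicates association only; it does not by itself show that the polymorphism contributes to the trait.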

Such correlations can be exploited in several ways. In the case of a strong
correlation between a set of one or more polymorphic forms and a disease for which
treatment is available, detection of the polymorphic form set in a human or animal
patient may justify immediate administration of treatment, or at least the institution of
regular monitoring of the patient. Detection of a polymorphic form correlated with
serious disease in a couple contemplating a family may also be valuable to the couple in
their reproductive decisions. For example, the female partner might elect to undergo in
vitro fertilization to avoid the possibility of transmitting such a polymorphism from her
husband to her offspring. In the case of a weaker, but still statistically significant
correlation between a polymorphic set and human disease, immediate therapeutic
intervention or monitoring may not be justified. Nevertheless, the patient can be
motivated to begin simple life-style changes (e.g., diet, exercise) that can be
accomplished at little cost to the patient but confer potential benefits in reducing the risk
of conditions to which the patient may have increased susceptibility by virtue of variant
alleles. Identification of a polymorphic set in a patient correlated with enhanced
receptiveness to one of several treatment regimes for a disease indicates that this
treatment regime should be followed.

For animals and plants, correlations between characteristics and phenotype are
useful for breeding for desired characteristics. For example, Beitz et al., US 5,292,639
discuss use of bovine mitochondrial polymorphisms in a breeding program to improve
milk production in cows. To evaluate the effect of mtDNA D-loop sequence
polymorphism on milk production, each cow was assigned a value of 1 if variant or 0 if
wildtype with respect to a prototypical mitochondrial DNA sequence at each of 17
locations considered. Each production trait was analyzed individually with the
following animal model:

Y_ijkpn = μ + YS_i + X_k + β_1 + ... + β_17 + PE_n + a_n + e_p

where Y_ijkpn is the milk, fat, fat percentage, SNF, SNF percentage, energy concentration,
or lactation energy record; μ is an overall mean; YS_i is the effect common to all cows
calving in year-season; X_k is the effect common to cows in either the high or average
selection line; β_1 to β_17 are the binomial regressions of production record on mtDNA
D-loop sequence polymorphisms; PE_n is permanent environmental effect common to all
records of cow n; a_n is effect of animal n and is composed of the additive genetic
contribution of sire and dam breeding values and a Mendelian sampling effect; and e_p is
a random residual. It was found that eleven of seventeen polymorphisms tested
influenced at least one production trait. Bovines having the best polymorphic forms for
milk production at these eleven loci are used as parents for breeding the next generation
of the herd.

D. Genetic Mapping of Phenotypic Traits 

The previous section concerns identifying correlations between phenotypic traits
and polymorphisms that directly or indirectly contribute to those traits. The present
section describes identification of a physical linkage between a genetic locus associated
with a trait of interest and polymorphic markers that are not associated with the trait, but
are in physical proximity with the genetic locus responsible for the trait and
co-segregate with it. Such analysis is useful for mapping a genetic locus associated with a
phenotypic trait to a chromosomal position, and thereby cloning gene(s) responsible for
the trait. See Lander et al., Proc. Natl. Acad. Sci. (USA) 83, 7353-7357 (1986); Lander
et al., Proc. Natl. Acad. Sci. (USA) 84, 2363-2367 (1987); Donis-Keller et al., Cell 51,
319-337 (1987); Lander et al., Genetics 121, 185-199 (1989). Genes localized by
linkage can be cloned by a process known as directional cloning. See Wainwright, Med.
J. Australia 159, 170-174 (1993); Collins, Nature Genetics 1, 3-6 (1992).

Linkage studies are typically performed on members of a family. Available
members of the family are characterized for the presence or absence of a phenotypic
trait and for a set of polymorphic markers. The distribution of polymorphic markers in
an informative meiosis is then analyzed to determine which polymorphic markers
co-segregate with a phenotypic trait. See, e.g., Kerem et al., Science 245, 1073-1080
(1989); Monaco et al., Nature 316, 842 (1985); Yamoka et al., Neurology 40, 222-226
(1990); Rossiter et al., FASEB Journal 5, 21-27 (1991).

Linkage is analyzed by calculation of LOD (log of the odds) values. A lod value
is the relative likelihood of obtaining observed segregation data for a marker and a
genetic locus when the two are located at a recombination fraction θ, versus the
situation in which the two are not linked, and thus segregating independently
(Thompson & Thompson, Genetics in Medicine (5th ed, W.B. Saunders Company,
Philadelphia, 1991); Strachan, "Mapping the human genome" in The Human Genome
(BIOS Scientific Publishers Ltd, Oxford), Chapter 4). A series of likelihood ratios are
calculated at various recombination fractions (θ), ranging from θ = 0.0 (coincident loci)
to θ = 0.50 (unlinked). Thus, the likelihood ratio at a given value of θ is the ratio of the
probability of the data if the loci are linked at θ to the probability of the data if the loci
are unlinked. The computed likelihood ratios are
usually expressed as the log10 of this ratio (i.e., a lod score). For example, a lod score
of 3 indicates 1000:1 odds against an apparent observed linkage being a coincidence.
The use of logarithms allows data collected from different families to be combined by
simple addition. Computer programs are available for the calculation of lod scores for
differing values of θ (e.g., LIPED, MLINK (Lathrop, Proc. Natl. Acad. Sci. (USA) 81,
3443-3446 (1984))). For any particular lod score, a recombination fraction may be
determined from mathematical tables. See Smith et al., Mathematical tables for
research workers in human genetics (Churchill, London, 1961); Smith, Ann. Hum.
Genet. 32, 127-150 (1968). The value of θ at which the lod score is the highest is
considered to be the best estimate of the recombination fraction.
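
The lod score calculation can be illustrated with the simplified Python sketch below. It assumes the simplest case of phase-known, fully informative meioses, in which each meiosis is scored as recombinant or non-recombinant; programs such as LIPED and MLINK handle general pedigrees and are not reproduced here. The function names lod_score and best_theta and the grid of θ values are assumptions of this sketch.

```python
from math import log10


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """log10 of [L(data | linked at theta) / L(data | unlinked, theta = 0.5)]."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie between 0.0 and 0.5")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    non_recombinants = meioses - recombinants
    log_linked = non_recombinants * log10(1.0 - theta)
    if recombinants:
        if theta == 0.0:
            return float("-inf")  # any recombinant rules out theta = 0
        log_linked += recombinants * log10(theta)
    return log_linked - meioses * log10(0.5)


def best_theta(recombinants: int, meioses: int, steps: int = 50) -> float:
    """Grid value of theta in [0, 0.5] at which the lod score is highest."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: lod_score(recombinants, meioses, t))


if __name__ == "__main__":
    # Ten informative meioses with no recombinants, evaluated at theta = 0
    print(lod_score(0, 10, 0.0))
    # Two recombinants in ten meioses: theta with the highest lod score
    print(best_theta(2, 10))
```

Because lod scores are logarithms, scores computed this way for different families at the same θ can be combined by simple addition, as noted above.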

Positive lod score values suggest that the two loci are linked, whereas negative
values suggest that linkage is less likely (at that value of θ) than the possibility that the
two loci are unlinked. By convention, a combined lod score of +3 or greater (equivalent
to greater than 1000:1 odds in favor of linkage) is considered definitive evidence that
two loci are linked. Similarly, by convention, a negative lod score of -2 or less is taken
as definitive evidence against linkage of the two loci being compared. Negative linkage
data are useful in excluding a chromosome or a segment thereof from consideration.
The search focuses on the remaining non-excluded chromosomal locations.

IV. Modified Polypeptides and Gene Sequences 

The invention further provides variant forms of nucleic acids and corresponding
proteins. The nucleic acids comprise one of the sequences described in the Table,
column 5, in which the polymorphic position is occupied by one of the alternative bases
for that position. Some nucleic acids encode full-length variant forms of proteins.
Similarly, variant proteins have the prototypical amino acid sequences encoded by
nucleic acid sequences shown in the Table, column 5, (read so as to be in-frame with the
full-length coding sequence of which it is a component) except at an amino acid
encoded by a codon including one of the polymorphic positions shown in the Table.
That position is occupied by the amino acid coded by the corresponding codon in any of
the alternative forms shown in the Table.

Variant genes can be expressed in an expression vector in which a variant gene is
operably linked to a native or other promoter. Usually, the promoter is a eukaryotic
promoter for expression in a mammalian cell. The transcription regulation sequences
typically include a heterologous promoter and optionally an enhancer which is
recognized by the host. The selection of an appropriate promoter, for example trp, lac,
phage promoters, glycolytic enzyme promoters and tRNA promoters, depends on the
host selected. Commercially available expression vectors can be used. Vectors can
include host-recognized replication systems, amplifiable genes, selectable markers, host
sequences useful for insertion into the host genome, and the like.

The means of introducing the expression construct into a host cell varies
depending upon the particular construction and the target host. Suitable means include
fusion, conjugation, transfection, transduction, electroporation or injection, as described
in Sambrook, supra. A wide variety of host cells can be employed for expression of the
variant gene, both prokaryotic and eukaryotic. Suitable host cells include bacteria such
as E. coli, yeast, filamentous fungi, insect cells, mammalian cells, typically
immortalized, e.g., mouse, CHO, human and monkey cell lines and derivatives thereof.
Preferred host cells are able to process the variant gene product to produce an
appropriate mature polypeptide. Processing includes glycosylation, ubiquitination,
disulfide bond formation, general post-translational modification, and the like. As used
herein, "gene product" includes mRNA, peptide and protein products.

The protein may be isolated by conventional means of protein biochemistry and
purification to obtain a substantially pure product, i.e., 80, 95 or 99% free of cell
component contaminants, as described in Jacoby, Methods in Enzymology Volume 104,
Academic Press, New York (1984); Scopes, Protein Purification, Principles and
Practice, 2nd Edition, Springer-Verlag, New York (1987); and Deutscher (ed), Guide to
Protein Purification, Methods in Enzymology, Vol. 182 (1990). If the protein is
secreted, it can be isolated from the supernatant in which the host cell is grown. If not
secreted, the protein can be isolated from a lysate of the host cells.

The invention further provides transgenic nonhuman animals capable of
expressing an exogenous variant gene and/or having one or both alleles of an
endogenous variant gene inactivated. Expression of an exogenous variant gene is
usually achieved by operably linking the gene to a promoter and optionally an enhancer,
and microinjecting the construct into a zygote. See Hogan et al., "Manipulating the
Mouse Embryo, A Laboratory Manual," Cold Spring Harbor Laboratory. Inactivation
of endogenous variant genes can be achieved by forming a transgene in which a cloned
variant gene is inactivated by insertion of a positive selection marker. See Capecchi,
Science 244, 1288-1292 (1989). The transgene is then introduced into an embryonic
stem cell, where it undergoes homologous recombination with an endogenous variant
gene. Mice and other rodents are preferred animals. Such animals provide useful drug
screening systems.

In addition to substantially full-length polypeptides expressed by variant genes, 
the present invention includes biologically active fragments of the polypeptides, or 
analogs thereof, including organic molecules which simulate the interactions of the 
peptides. Biologically active fragments include any portion of the full-length
polypeptide which confers a biological function on the variant gene product, including
ligand binding, and antibody binding. Ligand binding includes binding by nucleic 
acids, proteins or polypeptides, small biologically active molecules, or large cellular 
structures. 

Polyclonal and/or monoclonal antibodies that specifically bind to variant gene
products but not to corresponding prototypical gene products are also provided.
Antibodies can be made by injecting mice or other animals with the variant gene
product or synthetic peptide fragments thereof. Monoclonal antibodies are screened as
described, for example, in Harlow & Lane, Antibodies, A Laboratory Manual, Cold
Spring Harbor Press, New York (1988); Goding, Monoclonal antibodies, Principles and
Practice (2d ed.) Academic Press, New York (1986). Monoclonal antibodies are tested
for specific immunoreactivity with a variant gene product and lack of immunoreactivity
to the corresponding prototypical gene product. These antibodies are useful in
diagnostic assays for detection of the variant form, or as an active ingredient in a
pharmaceutical composition.



V. Kits 

The invention further provides kits comprising at least one allele-specific
oligonucleotide as described herein. Often, the kits contain one or more pairs of
allele-specific oligonucleotides hybridizing to different forms of a polymorphism. In some
kits, the allele-specific oligonucleotides are provided immobilized to a substrate. For
example, the same substrate can comprise allele-specific oligonucleotide probes for
detecting at least 10, 100 or all of the polymorphisms shown in the Table. Optional
additional components of the kit include, for example, restriction enzymes,
reverse-transcriptase or polymerase, the substrate nucleoside triphosphates, means used to label
(for example, an avidin-enzyme conjugate and enzyme substrate and chromogen if the
label is biotin), and the appropriate buffers for reverse transcription, PCR, or
hybridization reactions. Usually, the kit also contains instructions for carrying out the
methods.
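
As a rough illustration of a pair of allele-specific oligonucleotides hybridizing to different forms of a polymorphism, the following Python sketch builds one probe per allele, centered on the polymorphic base. The flanking sequences, the arm length of 8 bases and the function name are hypothetical and chosen only for illustration; they are not taken from the Table or from any sequence of the invention.

```python
def allele_specific_probes(flank_5p, alleles, flank_3p, arm=8):
    """Return one probe per allele, with the polymorphic base in the centre.

    flank_5p / flank_3p: sequence immediately upstream / downstream of the site.
    alleles: the alternative bases at the polymorphic site.
    arm: number of flanking bases kept on each side (an assumed value).
    """
    if len(flank_5p) < arm or len(flank_3p) < arm:
        raise ValueError("flanking sequence shorter than the requested arm length")
    left = flank_5p[-arm:]    # bases directly 5' of the site
    right = flank_3p[:arm]    # bases directly 3' of the site
    return {allele: left + allele + right for allele in alleles}


# Hypothetical flanking sequences around an A/G polymorphic site.
probes = allele_specific_probes("GATTACAGGCTTAACG", ("A", "G"), "TTCGGATCCATGCAAT")
for allele, probe in probes.items():
    print(allele, probe)
```

The two probes differ only at the central position, which is the property that lets each one discriminate its own allele under suitably stringent hybridization conditions.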

The thrombospondins are a family of extracellular matrix (ECM) glycoproteins
that modulate many cell behaviors including adhesion, migration, and proliferation.
Thrombospondins (also known as thrombin sensitive proteins or TSPs) are large
molecular weight glycoproteins composed of three identical disulfide-linked
polypeptide chains. TSPs are stored in the alpha-granules of platelets and secreted by a
variety of mesenchymal and epithelial cells (Majack et al., Cell Membrane 3:51-11
(1987)). Platelets secrete TSPs when activated in the blood by physiological
agonists such as thrombin. TSPs have lectin properties and a broad function in the
regulation of fibrinolysis and as a component of the ECM, and are one of a group of
ECM proteins which have adhesive properties. TSPs bind to fibronectin and fibrinogen
(Lahav et al., Eur J Biochem 745:151-6 (1984)), and these proteins are known to be
involved in platelet adhesion to substratum and platelet aggregation (Leung, J Clin
Invest 74:1764-1772 (1986)).

Recent work has implicated TSPs in the response of cells to growth factors.
Submitogenic doses of PDGF induce a rapid but transitory increase in TSP synthesis
and secretion by rat aortic smooth muscle cells (Majack et al., J Biol Chem 101:1059-10
(1985)). PDGF responsiveness of TSP synthesis in glial cells has also been shown
(Asch et al., Proc Natl Acad Sci 55:2904-8 (1986)). TSP mRNA levels rise rapidly in
response to PDGF (Majack et al., J Biol Chem 262:8821-5 (1987)). TSPs act
synergistically with epidermal growth factor to increase DNA synthesis in smooth
muscle cells (Majack et al., Proc Natl Acad Sci 53:9050-4 (1986)), and monoclonal
antibodies to TSPs inhibit smooth muscle cell proliferation (Majack et al., J Biol Chem
106:415-22 (1988)). TSPs modulate local adhesions in endothelial cells, and TSPs,
particularly TSP-1 primarily derived from platelet granules, are known to be
important activators of transforming growth factor beta-1 (TGFβ-1) (Crawford et al.,
Cell 93:1159 (1998)) and appear to be a potential link between platelet-thrombosis and
development of atherosclerosis.

To determine pivotal genes associated with premature coronary artery disease (CAD), we
analyzed DNA from 347 patients with MI or coronary revascularization before age 40
(men) or 45 (women) and 422 general population controls. Cases were drawn (one per
family) from a retrospective collection of sibling pairs with premature CAD. Controls
were ascertained through random-digit dialing. Both cases and controls were
Caucasian. A complete database of phenotypic and laboratory variables for the affected
patients allowed logistic regression to control for age, diabetes, body mass index and
gender.

SNPs in thrombospondin (TSP) 4 and 1 emerged as important markers associated with
premature CAD and MI. For CAD, 148 of 347 patients carried at least one copy of the
TSP-4 variant compared with 142 of 422 control subjects; adjusted odds ratio 1.47,
p=0.01. For premature MI, the association was even stronger: 91 of 187 cases vs. 142
of 422 controls had the variant; adjusted odds ratio 2.08, p=0.0003. The TSP-1 SNP
was rare. Nonetheless, homozygosity for the variant allele gave an adjusted odds ratio
of 9.5, p=0.04.
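
For orientation, the carrier counts quoted above can be turned into crude (unadjusted) odds ratios and carrier percentages with a few lines of Python. This is only a sketch: the odds ratios reported in the text were adjusted by logistic regression for age, diabetes, body mass index and gender, so the crude values computed here are not expected to match them exactly. The function names are illustrative.

```python
def crude_odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Unadjusted odds ratio for carrying the variant, cases versus controls."""
    case_non = case_total - case_carriers
    control_non = control_total - control_carriers
    return (case_carriers * control_non) / (case_non * control_carriers)


def carrier_percent(carriers, total):
    """Percentage of individuals carrying at least one copy of the variant."""
    return 100.0 * carriers / total


# Counts for the TSP-4 variant as quoted in the text.
cad_or = crude_odds_ratio(148, 347, 142, 422)   # premature CAD vs. controls
mi_or = crude_odds_ratio(91, 187, 142, 422)     # premature MI vs. controls

print(f"CAD: crude OR {cad_or:.2f}, carriers {carrier_percent(148, 347):.0f}%")
print(f"MI:  crude OR {mi_or:.2f}, carriers {carrier_percent(91, 187):.0f}%")
print(f"Controls: carriers {carrier_percent(142, 422):.0f}%")
```

The crude CAD value lies close to the adjusted odds ratio, while the crude MI value is somewhat lower than the adjusted one, which is the kind of shift covariate adjustment can produce.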

Specific reference nucleotide (SEQ ID NO: 1) and amino acid (SEQ ID NO: 2)
sequences for TSP-1 are shown in Figs. 1A-1D. Specific reference nucleotide (SEQ ID
NO: 3) and amino acid (SEQ ID NO: 4) sequences for TSP-4 are shown in Figs. 2A-2C.
It is understood that the invention is not limited by these exemplified reference
sequences, as variants of these sequences which differ at locations other than the SNP
sites identified herein can also be utilized. The skilled artisan can readily determine the
SNP sites in these other reference sequences which correspond to the SNP sites
identified herein by aligning the sequence of interest with the reference sequences
specifically disclosed herein, and programs for performing such alignments are
commercially available. For example, the ALIGN program in the GCG software
package can be used, utilizing a PAM120 weight residue table, a gap length penalty of
12 and a gap penalty of 4.
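
Once such an alignment is in hand, finding the position that corresponds to a given SNP site is a matter of walking the two gapped sequences in parallel. The sketch below assumes the alignment has already been produced by some alignment program and is supplied as two equal-length strings with "-" marking gaps; the short sequences are hypothetical and the function name is illustrative only.

```python
def map_position(aligned_ref, aligned_other, ref_pos):
    """Map a 1-based position in the reference to the other aligned sequence.

    Returns the 1-based position in the other sequence, or None when the
    reference position is aligned to a gap in the other sequence.
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have the same length")
    ref_i = other_i = 0
    for ref_char, other_char in zip(aligned_ref, aligned_other):
        if ref_char != "-":
            ref_i += 1
        if other_char != "-":
            other_i += 1
        if ref_char != "-" and ref_i == ref_pos:
            return other_i if other_char != "-" else None
    raise ValueError("ref_pos lies outside the reference sequence")


# Hypothetical gapped alignment (not TSP-1 or TSP-4 sequence).
aligned_reference = "ATG--CCAGT"
aligned_sequence = "ATGTTCC-GT"

print(map_position(aligned_reference, aligned_sequence, 5))   # two-base insertion shifts the site
print(map_position(aligned_reference, aligned_sequence, 6))   # site deleted in the other sequence
```

In practice the same walk would be applied with the reference position set to 2210 (TSP-1) or 1186 (TSP-4) against an alignment to the sequence of interest.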

Two SNPs have been specifically studied as described herein. The first (G334u4)
is a change from A (reference nucleotide) to G (alternate or variant nucleotide) at
nucleotide position 2210 of the nucleic acid sequence of TSP-1 (Figs. 1A-1D), resulting
in a missense amino acid mutation from asparagine (reference) to serine (alternate) at
amino acid 700. The second SNP (G355u2) is a change from G (reference) to C
(alternate) at nucleotide position 1186 of the nucleic acid sequence of TSP-4 (Figs.
2A-2C), resulting in a missense amino acid alteration from alanine (reference) to proline
(alternate) at amino acid 387. With respect to the G355u2 SNP, individuals with CAD
carried at least one copy of the variant "C" allele more frequently than control
individuals (43% as compared with 34%). With respect to the G355u2 SNP, individuals
with MI carried at least one copy of the variant "C" allele more frequently than control
individuals (49% as compared with 34%). With respect to the G334u4 SNP,
individuals with CAD carried two copies of the variant "G" allele more frequently than
control individuals (1.7% as compared with 0.2%). With respect to the G334u4 SNP,
individuals with MI carried two copies of the variant "G" allele more frequently than
control individuals (2% as compared with 0.2%).
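
The two missense changes described above are consistent with the standard genetic code, as the following Python sketch shows. It lists every codon in which a single substitution of the stated reference base by the stated variant base converts the reference amino acid into the variant amino acid. The actual codons at positions 700 of TSP-1 and 387 of TSP-4 are not stated here, so the output shows only which codons are possible in principle; the function name is illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_base_routes(ref_base, alt_base, ref_aa, alt_aa):
    """Codon pairs where one ref_base -> alt_base change turns ref_aa into alt_aa."""
    routes = []
    for codon, aa in CODON_TABLE.items():
        if aa != ref_aa:
            continue
        for i, base in enumerate(codon):
            if base != ref_base:
                continue
            variant = codon[:i] + alt_base + codon[i + 1:]
            if CODON_TABLE[variant] == alt_aa:
                routes.append((codon, variant))
    return sorted(routes)


# A -> G giving asparagine (N) -> serine (S), as described for TSP-1.
asn_to_ser = single_base_routes("A", "G", "N", "S")
# G -> C giving alanine (A) -> proline (P), as described for TSP-4.
ala_to_pro = single_base_routes("G", "C", "A", "P")

print(asn_to_ser)
print(ala_to_pro)
```

Under the standard code, the asparagine to serine change requires the substitution at the second codon position, and the alanine to proline change requires it at the first codon position.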

As used herein, the term "polymorphism" refers to the occurrence of two or more
genetically determined alternative sequences or alleles in a population. A polymorphic
marker or site is the locus at which divergence occurs. Preferred markers have at least
two alleles, each occurring at a frequency of greater than 1%, and more preferably greater
than 10% or 20% of a selected population. A polymorphic locus may be as small as one
base pair, in which case it is referred to as a single nucleotide polymorphism (SNP).

Thus, the invention relates to a method for predicting the likelihood that an
individual will have a vascular disease, or for aiding in the diagnosis of a vascular
disease, or predicting the likelihood of having altered symptomology associated with a
vascular disease, comprising the steps of obtaining a DNA sample from an individual to
be assessed and determining the nucleotide present at one or more of nucleotide
positions 2210 of the TSP-1 gene or 1186 of the TSP-4 gene. In a preferred
embodiment, the nucleotides present at both of these nucleotide positions are
determined. In one embodiment the TSP-1 gene has the nucleotide sequence of SEQ ID
NO: 1 and the TSP-4 gene has the nucleotide sequence of SEQ ID NO: 3. The presence
of one or more of a G (the variant nucleotide) at position 2210 of SEQ ID NO: 1 or a C
(the variant nucleotide) at position 1186 of SEQ ID NO: 3 indicates that the
individual has a greater likelihood of having a vascular disease, or a greater likelihood
of having severe symptomology associated with a vascular disease, than if that
individual had the reference nucleotide at one or more of these positions. Conversely,
the presence of one or more of an A (the reference nucleotide) at position 2210 of SEQ
ID NO: 1 or a G (the reference nucleotide) at position 1186 of SEQ ID NO: 3 indicates
that the individual has a reduced likelihood of having a vascular disease or a likelihood
of having reduced symptomology associated with a vascular disease than if that
individual had the variant nucleotide at one or more of these positions.
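
The determination step can be pictured as a simple lookup of the observed nucleotides against the reference and variant bases named above. The following Python sketch counts copies of the variant nucleotide at each site for a diploid genotype; the site labels, the two-letter genotype encoding and the function names are assumptions made for illustration, and the sketch is not a clinical interpretation tool.

```python
# Reference and variant nucleotides at the two sites named in the text.
SITES = {
    "TSP-1:2210": {"reference": "A", "variant": "G"},
    "TSP-4:1186": {"reference": "G", "variant": "C"},
}


def variant_copies(site, genotype):
    """Number of variant alleles (0, 1 or 2) in a two-letter genotype string."""
    alleles = SITES[site]
    allowed = {alleles["reference"], alleles["variant"]}
    if len(genotype) != 2 or not set(genotype) <= allowed:
        raise ValueError(f"unexpected genotype {genotype!r} at {site}")
    return genotype.count(alleles["variant"])


def summarize(genotypes):
    """Map each typed site to its variant-allele count."""
    return {site: variant_copies(site, gt) for site, gt in genotypes.items()}


# Hypothetical individual: reference homozygote at TSP-1, heterozygote at TSP-4.
example = summarize({"TSP-1:2210": "AA", "TSP-4:1186": "GC"})
print(example)
```

A count of one or more at either site corresponds to the presence of the variant nucleotide as described in the method; a count of zero corresponds to the reference nucleotide on both chromosomes.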

In a particular embodiment, the individual is an individual at risk for development
of a vascular disease. In another embodiment the individual exhibits clinical
symptomology associated with a vascular disease. In one embodiment, the individual
has been clinically diagnosed as having a vascular disease. Vascular diseases include,
but are not limited to, atherosclerosis, coronary heart disease, myocardial infarction
(MI), stroke, peripheral vascular diseases, venous thromboembolism and pulmonary
embolism. In preferred embodiments, the vascular disease is CAD or MI.

The genetic material to be assessed can be obtained from any nucleated cell from
the individual. For assay of genomic DNA, virtually any biological sample (other than
pure red blood cells) is suitable. For example, convenient tissue samples include whole
blood, semen, saliva, tears, urine, fecal material, sweat, skin and hair. For assay of
cDNA or mRNA, the tissue sample must be obtained from a tissue or organ in which
the target nucleic acid is expressed.

Many of the methods described herein require amplification of DNA from target
samples. This can be accomplished by, e.g., PCR. See generally PCR Technology:
Principles and Applications for DNA Amplification (ed. H.A. Erlich, Freeman Press,
NY, NY, 1992); PCR Protocols: A Guide to Methods and Applications (eds. Innis et
al., Academic Press, San Diego, CA, 1990); Mattila et al., Nucleic Acids Res. 19, 4967
(1991); Eckert et al., PCR Methods and Applications 1, 17 (1991); PCR (eds.
McPherson et al., IRL Press, Oxford); and U.S. Patent 4,683,202.

Other suitable amplification methods include the ligase chain reaction (LCR) (see
Wu and Wallace, Genomics 4, 560 (1989), Landegren et al., Science 241, 1077 (1988)),
transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA 86, 1173 (1989)),
self-sustained sequence replication (Guatelli et al., Proc. Nat. Acad. Sci. USA 87,
1874 (1990)) and nucleic acid based sequence amplification (NASBA). The latter two
amplification methods involve isothermal reactions based on isothermal transcription,
which produce both single stranded RNA (ssRNA) and double stranded DNA (dsDNA)
as the amplification products in a ratio of about 30 or 100 to 1, respectively.

The nucleotide which occupies the polymorphic site of interest (e.g., nucleotide
position 2210 in TSP-1 and/or nucleotide position 1186 in TSP-4) can be identified by a
variety of methods, such as Southern analysis of genomic DNA; direct mutation
analysis by restriction enzyme digestion; Northern analysis of RNA; denaturing high
pressure liquid chromatography (DHPLC); gene isolation and sequencing; hybridization
of an allele-specific oligonucleotide with amplified gene products; and single base extension
(SBE). In a preferred embodiment, determination of the allelic form of TSP is carried
out using SBE-FRET methods as described herein, or using chip-based oligonucleotide
arrays as described herein.

The invention also relates to a method for predicting the likelihood that an
individual will have a vascular disease, or for aiding in the diagnosis of a vascular
disease, or predicting the likelihood of having altered symptomology associated with a
vascular disease, comprising the steps of obtaining a biological sample comprising TSP-1
and/or TSP-4 protein or relevant portion thereof from an individual to be assessed and
determining the amino acid present at one or more of amino acid positions 700 of the
TSP-1 gene product (e.g., as exemplified by SEQ ID NO: 2) or 387 of the TSP-4 gene
product (e.g., as exemplified by SEQ ID NO: 4). In a preferred embodiment, the amino
acids present at both of these amino acid positions are determined. As used herein, the
term "relevant portion" of the TSP-1 and TSP-4 proteins is intended to encompass any
portion of the protein which comprises the polymorphic amino acid positions. The
presence of one or more of a serine (the variant amino acid) at position 700 of SEQ ID
NO: 2, or a proline (the variant amino acid) at position 387 of SEQ ID NO: 4 indicates
that the individual has a greater likelihood of having a vascular disease, or a greater
likelihood of having severe symptomology associated with a vascular disease, than if
that individual had the reference amino acid at one or more of these positions.
Conversely, the presence of one or more of an asparagine (the reference amino acid) at
position 700 of SEQ ID NO: 2, or an alanine (the reference amino acid) at position 387
of SEQ ID NO: 4 indicates that the individual has a reduced likelihood of having a
vascular disease or a likelihood of having reduced symptomology associated with a
vascular disease, than if that individual had the variant amino acid at one or more of
these positions.

In a particular embodiment, the individual is an individual at risk for development 
of a vascular disease. In another embodiment the individual exhibits clinical 
symptomology associated with a vascular disease. In one embodiment, the individual 
has been clinically diagnosed as having a vascular disease. 

In this embodiment of the invention, the biological sample contains protein
molecules from the test subject. In vitro techniques for detection of protein include
enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations
and immunofluorescence. Furthermore, in vivo techniques for detection of protein
include introducing into a subject a labeled anti-protein antibody. For example, the
antibody can be labeled with a radioactive marker whose presence and location in a
subject can be detected by standard imaging techniques. Polyclonal and/or monoclonal
antibodies that specifically bind to variant gene products but not to corresponding
reference gene products, and vice versa, are also provided. Antibodies can be made by
injecting mice or other animals with the variant gene product or synthetic peptide
fragments thereof comprising the variant portion. Monoclonal antibodies are screened
as described, for example, in Harlow & Lane, Antibodies, A Laboratory Manual,
Cold Spring Harbor Press, New York (1988); Goding, Monoclonal antibodies,
Principles and Practice (2d ed.) Academic Press, New York (1986). Monoclonal
antibodies are tested for specific immunoreactivity with a variant gene product and lack
of immunoreactivity to the corresponding prototypical gene product. These antibodies
are useful in diagnostic assays for detection of the variant form, or as an active
ingredient in a pharmaceutical composition.

The polymorphisms of the invention may be associated with vascular disease in
different ways. The polymorphisms may exert phenotypic effects indirectly via
influence on replication, transcription, and translation. Additionally, the described
polymorphisms may predispose an individual to a distinct mutation that is causally
related to a certain phenotype, such as susceptibility or resistance to vascular disease
and related disorders. The discovery of the polymorphisms and their correlation with
CAD and MI facilitates biochemical analysis of the variant and reference forms and the
development of assays to characterize the variant and reference forms and to screen for
pharmaceutical agents that interact directly with one or another form of the protein.

Alternatively, these particular polymorphisms may belong to a group of two or
more polymorphisms in the TSP gene(s) which contributes to the presence, absence or
severity of vascular disease. An assessment of other polymorphisms within the TSP
gene(s) can be undertaken, and the separate and combined effects of these
polymorphisms, as well as alterations in other, distinct genes, on the vascular disease
phenotype can be assessed.

Correlation between a particular phenotype, e.g., the CAD or MI phenotype, and
the presence or absence of a particular allele is performed for a population of
individuals who have been tested for the presence or absence of the phenotype.
Correlation can be performed by standard statistical methods such as a Chi-squared test,
and statistically significant correlations between polymorphic form(s) and phenotypic
characteristics are noted. This correlation can be exploited in several ways. In the case
of a strong correlation between a particular polymorphic form, e.g., the variant allele for
TSP-1 and/or TSP-4, and a disease for which treatment is available, detection of the
polymorphic form in an individual may justify immediate administration of treatment,
or at least the institution of regular monitoring of the individual. Detection of a
polymorphic form correlated with a disorder in a couple contemplating a family may
also be valuable to the couple in their reproductive decisions. For example, the female
partner might elect to undergo in vitro fertilization to avoid the possibility of
transmitting such a polymorphism from her husband to her offspring. In the case of a
weaker, but still statistically significant correlation between a polymorphic form and a
particular disorder, immediate therapeutic intervention or monitoring may not be
justified. Nevertheless, the individual can be motivated to begin simple life-style
changes (e.g., diet modification, therapy or counseling) that can be accomplished at
little cost to the individual but confer potential benefits in reducing the risk of
conditions to which the individual may have increased susceptibility by virtue of the
particular allele. Furthermore, identification of a polymorphic form correlated with
enhanced receptiveness to one of several treatment regimens for a disorder indicates that
this treatment regimen should be followed for the individual in question.
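
As one way of carrying out the Chi-squared test mentioned above, the following Python sketch computes the Pearson statistic for a 2x2 table of carriers and non-carriers among cases and controls, together with its p-value on one degree of freedom. It uses only the standard library and applies no continuity correction, which is an assumption of this sketch; the counts are the TSP-4 carrier counts for CAD cases and controls quoted earlier, and the function name is illustrative.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test for the 2x2 table [[a, b], [c, d]].

    a, b: carriers and non-carriers among cases.
    c, d: carriers and non-carriers among controls.
    Returns (chi2, p_value) for one degree of freedom, without continuity correction.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For one degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# CAD cases: 148 carriers of 347; controls: 142 carriers of 422.
chi2, p_value = chi_squared_2x2(148, 347 - 148, 142, 422 - 142)
print(f"chi-squared = {chi2:.2f}, p = {p_value:.3f}")
```

This unadjusted test ignores covariates, so it complements rather than replaces the logistic regression used to obtain the adjusted odds ratios.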

Furthermore, it may be possible to identify a physical linkage between a genetic
locus associated with a trait of interest (e.g., CAD or MI) and polymorphic markers that
are or are not associated with the trait, but are in physical proximity with the genetic
locus responsible for the trait and co-segregate with it. Such analysis is useful for
mapping a genetic locus associated with a phenotypic trait to a chromosomal position,
and thereby cloning gene(s) responsible for the trait. See Lander et al., Proc. Natl.
Acad. Sci. (USA) 83, 7353-7357 (1986); Lander et al., Proc. Natl. Acad. Sci. (USA) 84,
2363-2367 (1987); Donis-Keller et al., Cell 51, 319-337 (1987); Lander et al., Genetics
121, 185-199 (1989). Genes localized by linkage can be cloned by a process known as
directional cloning. See Wainwright, Med. J. Australia 159, 170-174 (1993); Collins,
Nature Genetics 1, 3-6 (1992). Linkage studies are discussed in more detail above.

In another embodiment, the invention relates to pharmaceutical compositions
comprising a reference TSP-1 and/or TSP-4 gene or gene product for use in the
treatment of vascular disease, e.g., CAD and MI. As used herein, a reference TSP gene
product is intended to mean gene products which are encoded by the reference allele of
the TSP gene. In addition to substantially full-length polypeptides expressed by the
genes, the present invention includes biologically active fragments of the polypeptides,
or analogs thereof, including organic molecules which simulate the interactions of the
peptides. Biologically active fragments include any portion of the full-length
polypeptide which confers a biological function on the variant gene product, including
ligand binding, and antibody binding. Ligand binding includes binding by nucleic
acids, proteins or polypeptides, small biologically active molecules, or large cellular
structures.

For instance, the polypeptide or protein, or fragment thereof, of the present
invention can be formulated with a physiologically acceptable medium to prepare a
pharmaceutical composition. The particular physiological medium may include, but is
not limited to, water, buffered saline, polyols (e.g., glycerol, propylene glycol, liquid
polyethylene glycol) and dextrose solutions. The optimum concentration of the active
ingredient(s) in the chosen medium can be determined empirically, according to
procedures well known to medicinal chemists, and will depend on the ultimate
pharmaceutical formulation desired. Methods of introduction of exogenous peptides at
the site of treatment include, but are not limited to, intradermal, intramuscular,
intraperitoneal, intravenous, subcutaneous, oral and intranasal. Other suitable methods
of introduction can also include rechargeable or biodegradable devices and slow release
polymeric devices. The pharmaceutical compositions of this invention can also be
administered as part of a combinatorial therapy with other agents and treatment
regimens.

The invention further pertains to compositions, e.g., vectors, comprising a
nucleotide sequence encoding reference or variant TSP-1 and/or TSP-4 gene products.
For example, reference genes can be expressed in an expression vector in which a
reference gene is operably linked to a native or other promoter. Usually, the promoter is
a eukaryotic promoter for expression in a mammalian cell. The transcription regulation
sequences typically include a heterologous promoter and optionally an enhancer which
is recognized by the host. The selection of an appropriate promoter, for example trp,
lac, phage promoters, glycolytic enzyme promoters and tRNA promoters, depends on
the host selected. Commercially available expression vectors can be used. Vectors can
include host-recognized replication systems, amplifiable genes, selectable markers, host
sequences useful for insertion into the host genome, and the like.

The means of introducing the expression construct into a host cell varies
depending upon the particular construction and the target host. Suitable means include
fusion, conjugation, transfection, transduction, electroporation or injection, as described
in Sambrook, supra. A wide variety of host cells can be employed for expression of the
variant gene, both prokaryotic and eukaryotic. Suitable host cells include bacteria such
as E. coli, yeast, filamentous fungi, insect cells, mammalian cells, typically
immortalized, e.g., mouse, CHO, human and monkey cell lines and derivatives thereof.
Preferred host cells are able to process the variant gene product to produce an
appropriate mature polypeptide. Processing includes glycosylation, ubiquitination,
disulfide bond formation, general post-translational modification, and the like.

It is also contemplated that cells can be engineered to express the reference allele
of the invention by gene therapy methods. For example, DNA encoding the reference
TSP gene product, or an active fragment or derivative thereof, can be introduced into an
expression vector, such as a viral vector, and the vector can be introduced into
appropriate cells in an animal. In such a method, the cell population can be engineered
to inducibly or constitutively express active reference TSP gene product. In a preferred
embodiment, the vector is delivered to the bone marrow, for example as described in
Corey et al. (Science 244:1275-1281 (1989)).

The invention further relates to the use of compositions (i.e., agonists) which
enhance or increase the activity of the reference (or variant) TSP (e.g., TSP-1 or TSP-4)
gene product, or a functional portion thereof, for use in the treatment of vascular
disease. The invention also relates to the use of compositions (i.e., antagonists) which
reduce or decrease the activity of the variant (or reference) TSP (e.g., TSP-1 or TSP-4)
gene product, or a functional portion thereof, for use in the treatment of vascular
disease.

The invention also relates to constructs which comprise a vector into which a
sequence of the invention has been inserted in a sense or antisense orientation. For
example, a vector comprising a nucleotide sequence which is antisense to the variant
TSP-1 or TSP-4 allele may be used as an antagonist of the activity of the TSP-1 or
TSP-4 variant allele. Alternatively, a vector comprising a nucleotide sequence of the TSP-1
or TSP-4 reference allele may be used therapeutically to treat vascular diseases. As
used herein, the term "vector" refers to a nucleic acid molecule capable of transporting
another nucleic acid to which it has been linked. One type of vector is a "plasmid",
which refers to a circular double stranded DNA loop into which additional DNA
segments can be ligated. Another type of vector is a viral vector, wherein additional
DNA segments can be ligated into the viral genome. Certain vectors are capable of
autonomous replication in a host cell into which they are introduced (e.g., bacterial
vectors having a bacterial origin of replication and episomal mammalian vectors).
Other vectors (e.g., non-episomal mammalian vectors) are integrated into the genome of
a host cell upon introduction into the host cell, and thereby are replicated along with the
host genome. Moreover, certain vectors, expression vectors, are capable of directing the
expression of genes to which they are operably linked. In general, expression vectors of
utility in recombinant DNA techniques are often in the form of plasmids (vectors).
However, the invention is intended to include such other forms of expression vectors,
such as viral vectors (e.g., replication defective retroviruses, adenoviruses and
adeno-associated viruses) that serve equivalent functions.

Preferred recombinant expression vectors of the invention comprise a nucleic acid
of the invention in a form suitable for expression of the nucleic acid in a host cell. This
means that the recombinant expression vectors include one or more regulatory
sequences, selected on the basis of the host cells to be used for expression, which are
operably linked to the nucleic acid sequence to be expressed. Within a recombinant
expression vector, "operably linked" is intended to mean that the nucleotide sequence of
interest is linked to the regulatory sequence(s) in a manner which allows for expression
of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a
host cell when the vector is introduced into the host cell). The term "regulatory
sequence" is intended to include promoters, enhancers and other expression control
elements (e.g., polyadenylation signals). Such regulatory sequences are described, for
example, in Goeddel, Gene Expression Technology: Methods in Enzymology 185,
Academic Press, San Diego, CA (1990). Regulatory sequences include those which
direct constitutive expression of a nucleotide sequence in many types of host cell and
those which direct expression of the nucleotide sequence only in certain host cells (e.g.,
tissue-specific regulatory sequences). It will be appreciated by those skilled in the art
that the design of the expression vector can depend on such factors as the choice of the
host cell to be transformed, the level of expression of protein desired, etc.

The expression vectors of the invention can be introduced into host cells to
thereby produce proteins or peptides, including fusion proteins or peptides, encoded by
nucleic acids as described herein. The recombinant expression vectors of the invention
can be designed for expression of a polypeptide of the invention in prokaryotic or
eukaryotic cells, e.g., bacterial cells such as E. coli, insect cells (using baculovirus
expression vectors), yeast cells or mammalian cells. Suitable host cells are discussed
further in Goeddel, supra. Alternatively, the recombinant expression vector can be
transcribed and translated in vitro, for example using T7 promoter regulatory sequences
and T7 polymerase.

Another aspect of the invention pertains to host cells into which a recombinant
expression vector of the invention has been introduced. The terms "host cell" and
"recombinant host cell" are used interchangeably herein. It is understood that such
terms refer not only to the particular subject cell but also to the progeny or potential
progeny of such a cell. Because certain modifications may occur in succeeding
generations due to either mutation or environmental influences, such progeny may not,
in fact, be identical to the parent cell, but are still included within the scope of the term
as used herein. A host cell can be any prokaryotic or eukaryotic cell. For example, a
nucleic acid of the invention can be expressed in bacterial cells (e.g., E. coli), insect
cells, yeast or mammalian cells (such as Chinese hamster ovary cells (CHO) or COS
cells). Other suitable host cells are known to those skilled in the art.

Vector DNA can be introduced into prokaryotic or eukaryotic cells via
conventional transformation or transfection techniques. As used herein, the terms
"transformation" and "transfection" are intended to refer to a variety of art-recognized
techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including
calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated
transfection, lipofection, or electroporation. Suitable methods for transforming or
transfecting host cells can be found in Sambrook et al. (supra), and other laboratory
manuals.

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in
culture, can be used to produce (i.e., express) a polypeptide of the invention.
Accordingly, the invention further provides methods for producing a polypeptide using
the host cells of the invention. In one embodiment, the method comprises culturing the
host cell of the invention (into which a recombinant expression vector encoding a
polypeptide of the invention has been introduced) in a suitable medium such that the
polypeptide is produced. In another embodiment, the method further comprises
isolating the polypeptide from the medium or the host cell.

The host cells of the invention can also be used to produce non-human transgenic
animals. For example, in one embodiment, a host cell of the invention is a fertilized
oocyte or an embryonic stem cell into which a nucleic acid of the invention has been
introduced. Such host cells can then be used to create non-human transgenic animals in
which exogenous nucleotide sequences have been introduced into their genome or
homologous recombinant animals in which endogenous nucleotide sequences have been
altered. Such animals are useful for studying the function and/or activity of the
nucleotide sequence and polypeptide encoded by the sequence and for identifying
and/or evaluating modulators of their activity. As used herein, a "transgenic animal" is
a non-human animal, preferably a mammal, more preferably a rodent such as a rat or
mouse, in which one or more of the cells of the animal includes a transgene. Other
examples of transgenic animals include non-human primates, sheep, dogs, cows, goats,
chickens, amphibians, etc. A transgene is exogenous DNA which is integrated into the
genome of a cell from which a transgenic animal develops and which remains in the
genome of the mature animal, thereby directing the expression of an encoded gene
product in one or more cell types or tissues of the transgenic animal. As used herein, a
"homologous recombinant animal" is a non-human animal, preferably a mammal, more
preferably a mouse, in which an endogenous gene has been altered by homologous
recombination between the endogenous gene and an exogenous DNA molecule
introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to
development of the animal.

A transgenic animal of the invention can be created by introducing a nucleic acid
of the invention into the male pronuclei of a fertilized oocyte, e.g., by microinjection
or retroviral infection, and allowing the oocyte to develop in a pseudopregnant female
foster animal. The sequence can be introduced as a transgene into the genome of a
non-human animal. Intronic sequences and polyadenylation signals can also be included
in the transgene to increase the efficiency of expression of the transgene. A
tissue-specific regulatory sequence(s) can be operably linked to the transgene to direct
expression of a polypeptide in particular cells. Methods for generating transgenic
animals via embryo manipulation and microinjection, particularly animals such as mice,
have become conventional in the art and are described, for example, in U.S. Patent Nos.
4,736,866 and 4,870,009, U.S. Patent No. 4,873,191 and in Hogan, Manipulating the
Mouse Embryo (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.,
1986). Similar methods are used for production of other transgenic animals. A
transgenic founder animal can be identified based upon the presence of the transgene in
its genome and/or expression of mRNA in tissues or cells of the animals. A transgenic
founder animal can then be used to breed additional animals carrying the transgene.
Moreover, transgenic animals carrying the transgene can further
be bred to other transgenic animals carrying other transgenes.

The invention also relates to the use of the variant and reference gene products to 
guide efforts to identify the causative mutation for vascular diseases or to identify or 
synthesize agents useful in the treatment of vascular diseases, e.g., CAD and MI. 






Amino acids that are essential for function can be identified by methods known in the
art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham et
al., Science, 244:1081-1085 (1989)). The latter procedure introduces single alanine
mutations at every residue in the molecule. The resulting mutant molecules are then
tested for biological activity in vitro. Sites that are critical for
polypeptide activity can also be determined by structural analysis such as
crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith et al., J.
Mol. Biol., 224:899-904 (1992); de Vos et al., Science, 255:306-312 (1992)).

Another aspect of the invention pertains to monitoring the influence of agents
(e.g., drugs, compounds) on the expression or activity of proteins of the invention in
clinical trials. An exemplary method for detecting the presence or absence of proteins
or nucleic acids of the invention in a biological sample involves obtaining a biological
sample from a test subject and contacting the biological sample with a compound or an
agent capable of detecting the protein, or nucleic acid (e.g., mRNA, genomic DNA) that
encodes the protein, such that the presence of the protein or nucleic acid is detected in
the biological sample. A preferred agent for detecting mRNA or genomic DNA is a
labeled nucleic acid probe capable of hybridizing to mRNA or genomic DNA sequences
described herein, preferably in an allele-specific manner. The nucleic acid probe can be,
for example, a full-length nucleic acid, or a portion thereof, such as an oligonucleotide
of at least 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically
hybridize under stringent conditions to appropriate mRNA or genomic DNA. Other
suitable probes for use in the diagnostic assays of the invention are described herein.

The invention also encompasses kits for detecting the presence of proteins or
nucleic acid molecules of the invention in a biological sample. For example, the kit can
comprise a labeled compound or agent (e.g., nucleic acid probe) capable of detecting
protein or mRNA in a biological sample; means for determining the amount of protein
or mRNA in the sample; and means for comparing the amount of protein or mRNA in
the sample with a standard. The compound or agent can be packaged in a suitable
container. The kit can further comprise instructions for using the kit to detect protein or
nucleic acid.

The following Examples are offered for the purpose of illustrating the present
invention and are not to be construed to limit the scope of this invention. The teachings
of all references cited herein are hereby incorporated herein by reference.

EXAMPLES 

Identification of Single Nucleotide Polymorphisms 

The polymorphisms shown in the Table were identified by resequencing of target
sequences from individuals of diverse ethnic and geographic backgrounds by
hybridization to probes immobilized to microfabricated arrays. The strategy and
principles for design and use of such arrays are generally described in WO 95/11995.

A typical probe array used in this analysis has two groups of four sets of probes
that respectively tile both strands of a reference sequence. A first probe set comprises a
plurality of probes exhibiting perfect complementarity with one of the reference
sequences. Each probe in the first probe set has an interrogation position that
corresponds to a nucleotide in the reference sequence. That is, the interrogation position
is aligned with the corresponding nucleotide in the reference sequence, when the probe
and reference sequence are aligned to maximize complementarity between the two. For
each probe in the first set, there are three corresponding probes from three additional
probe sets. Thus, there are four probes corresponding to each nucleotide in the
reference sequence. The probes from the three additional probe sets are identical to the
corresponding probe from the first probe set except at the interrogation position, which
occurs in the same position in each of the four corresponding probes from the four probe
sets, and is occupied by a different nucleotide in the four probe sets. In the present
analysis, probes were 25 nucleotides long. Arrays tiled for multiple different reference
sequences were included on the same substrate.
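
The tiling scheme described above can be illustrated with a short sketch. It is not the array design software used in this analysis. It assumes, purely for illustration, that the interrogation position sits at the center of each 25-nucleotide probe and that a single strand is tiled; the function names and the example reference sequence are hypothetical.

```python
# Illustrative sketch only: builds the four probes per reference nucleotide.
# Assumption (not stated in the text): the interrogation position is the
# central base of an odd-length probe.

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
BASES = "ACGT"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def tile_probes(reference, probe_len=25):
    """Map each interrogable reference position to its four probes.

    The returned dict is keyed by 0-based reference position. Each value is a
    dict keyed by the base placed at the probe's interrogation position.
    """
    if probe_len % 2 == 0:
        raise ValueError("probe_len must be odd so that a central base exists")
    half = probe_len // 2
    tiles = {}
    for pos in range(half, len(reference) - half):
        window = reference[pos - half: pos + half + 1]
        # Probe with perfect complementarity to the reference window.
        perfect = reverse_complement(window)
        # The other three probes differ only at the interrogation position.
        tiles[pos] = {
            base: perfect[:half] + base + perfect[half + 1:] for base in BASES
        }
    return tiles


if __name__ == "__main__":
    reference = "ACGT" * 8  # hypothetical 32-nt reference
    tiles = tile_probes(reference)
    for base, probe in tiles[12].items():
        print(base, probe)
```

For every position, the probe keyed by the complement of the reference base is the perfect-match probe; the other three carry a single central mismatch.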
Publicly available sequences for a given gene were assembled into Gap4
(http://www.biozentrujm.u^). PCR primers
covering each exon were designed using Primer 3
(http://www-genome.wi.mit.edu/cgi-bin/primer/primer3.cgi). Primers were not designed
in regions where there were sequence discrepancies between reads. Genomic DNA was
amplified in at least 50 individuals using 2.5 pmol each primer, 1.5 mM MgCl2,
100 µM dNTPs, 0.75 µM AmpliTaq GOLD polymerase, and 19 ng DNA in a 15 µl reaction.
Reactions were assembled using a PACKARD MultiPROBE robotic pipetting station and
then put in MJ 96-well tetrad thermocyclers (96°C for 10 minutes, followed by 35 cycles
of 96°C for 30 seconds, 59°C for 2 minutes, and 72°C for 2 minutes). A subset of the PCR
assays for each individual was run on 3% NuSieve gels in 0.5X TBE to confirm that
the reaction worked.

For a given DNA, 5 µl (about 50 ng) of each PCR or RT-PCR product were
pooled (final volume = 150-200 µl). The products were purified using QiaQuick PCR
purification from Qiagen. The samples were eluted once in 35 µl sterile water and 4 µl
10X One-Phor-All buffer (Pharmacia). The pooled samples were digested with 0.2 µ
DNaseI (Promega) for 10 minutes at 37°C and then labeled with 0.5 nmols biotin-N6-
ddATP and 15 µ Terminal Transferase (GibcoBRL Life Technology) for 60 minutes at
37°C. Both fragmentation and labeling reactions were terminated by incubating the
pooled sample for 15 minutes at 100°C.

Low-density DNA chips (Affymetrix, CA) were hybridized following the
manufacturer's instructions. Briefly, the hybridization cocktail consisted of 3M
TMACl, 10 mM Tris pH 7.8, 0.01% Triton X-100, 100 mg/ml herring sperm DNA
(Gibco BRL), and 200 pM control biotin-labeled oligo. The processed PCR products were
denatured for 7 minutes at 100°C and then added to prewarmed (37°C) hybridization
solution. The chips were hybridized overnight at 44°C. Chips were washed in 1X
SSPET and 6X SSPET followed by staining with 2 µg/ml SARPE and 0.5 mg/ml
acetylated BSA in 200 µl of 6X SSPET for 8 minutes at room temperature. Chips were
scanned using a Molecular Dynamics scanner.
Chip image files were analyzed using Ulysses (Affymetrix, CA), which uses four
algorithms to identify potential polymorphisms. Candidate polymorphisms were
visually inspected and assigned a confidence value: high confidence candidates
displayed all three genotypes, while likely candidates showed only two genotypes
(homozygous for reference sequence and heterozygous for reference and variant). Some
of the candidate polymorphisms were confirmed by ABI sequencing. Identified
polymorphisms were compared to several databases to determine if they were novel.
Results are shown in the Table.
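
The confidence rule stated above can be written out as a small sketch. It is only an illustration of the rule as worded here, not the Ulysses software or the visual inspection actually performed. The genotype labels and the function name are hypothetical, and the "unclassified" outcome is an assumption added to cover cases the text does not address.

```python
# Illustrative sketch of the stated confidence rule; labels are hypothetical.

ALL_THREE = {"ref/ref", "ref/var", "var/var"}
REF_AND_HET = {"ref/ref", "ref/var"}


def classify_candidate(observed_genotypes):
    """Assign a confidence label from the genotypes seen across individuals."""
    seen = set(observed_genotypes)
    if seen == ALL_THREE:
        return "high confidence"
    if seen == REF_AND_HET:
        return "likely"
    # Assumption: any other pattern is left for further review.
    return "unclassified"


if __name__ == "__main__":
    print(classify_candidate(["ref/ref", "ref/var", "var/var", "ref/ref"]))
    print(classify_candidate(["ref/ref", "ref/var", "ref/ref"]))
```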

Association of Thrombospondin Gene Polymorphisms with Vascular Disease

To determine pivotal genes associated with premature coronary artery disease, we
analyzed DNA from 347 patients with MI or coronary revascularization before age 40
(men) or 45 (women) and 422 general population controls. Cases were drawn (one per
family) from a retrospective collection of sibling pairs with premature CAD. Controls
were ascertained through random-digit dialing. Both cases and controls were
Caucasian. A complete database of phenotypic and laboratory variables for the affected
patients allowed logistic regression to control for age, diabetes, body mass index, and
gender.

SNPs in thrombospondin (TSP) 4 and 1 emerged as important variants associated with
premature CAD and MI. For CAD, 148 of 347 patients carried at least one copy of the
TSP-4 variant compared with 142 of 422 control subjects; adjusted odds ratio 1.47,
p=0.01. For premature MI, the association was even stronger: 91 of 187 cases vs. 142
of 422 controls had the variant; adjusted odds ratio 2.08, p=0.0003. The TSP-1 SNP
was rare. Nonetheless, homozygosity for the variant allele gave an adjusted odds ratio
of 9.5, p=0.04.
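
As a rough check on the carrier counts reported above, the crude (unadjusted) odds ratios can be computed directly from the 2x2 tables. This sketch does not reproduce the logistic regression used in the study, so its results need not match the adjusted values; the function name is hypothetical.

```python
# Crude odds ratio from carrier counts; no covariate adjustment is applied.

def odds_ratio(case_carriers, case_total, control_carriers, control_total):
    """Return the unadjusted odds ratio for carrying the variant."""
    case_noncarriers = case_total - case_carriers
    control_noncarriers = control_total - control_carriers
    return (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers
    )


cad_or = odds_ratio(148, 347, 142, 422)  # premature CAD vs. controls
mi_or = odds_ratio(91, 187, 142, 422)    # premature MI vs. controls

print(f"Unadjusted OR, premature CAD: {cad_or:.2f}")  # 1.47
print(f"Unadjusted OR, premature MI: {mi_or:.2f}")    # 1.87
```

The crude CAD estimate happens to coincide with the reported adjusted value of 1.47, while the crude MI estimate (1.87) is lower than the adjusted 2.08, which is consistent with the adjustment for age, diabetes, body mass index, and gender changing the MI estimate.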
L transporting, beta 2 polypeptide
ACAGCCAAGA [A/C] CTGGGAACTC
SDF1, stromal cell-derived factor
PDHA1, pyruvate dehydrogenase
PDHB, pyruvate dehydrogenase (lipoamide) beta
IL11RA, interleukin 11 receptor,
RRM1, ribonucleotide reductase M1 polypeptide
RRM1, ribonucleotide reductase M1 polypeptide
RRM1, ribonucleotide reductase M1 polypeptide
RRM1, ribonucleotide reductase M1 polypeptide
POLR2B, polymerase (RNA) II (DNA directed) polypeptide B (140kD)
POLR2B, polymerase (RNA) II (DNA
POLR2B, polymerase (RNA) II (DNA directed) polypeptide B (140kD)
POLR2B, polymerase (RNA) II (DNA directed) polypeptide B (140kD)
POLR2B, polymerase (RNA) II (DNA directed) polypeptide B (140kD)
POLR2A, polymerase (RNA) II (DNA directed) polypeptide A (220kD)
POLR2A, polymerase (RNA) II (DNA directed) polypeptide A (220kD)
POLR2A, polymerase (RNA) II (DNA directed) polypeptide A (220kD)
POLR2A, polymerase (RNA) II (DNA directed) polypeptide A (220kD)
directed) polypeptide A (220kD)
POLR2A, polymerase (RNA) II (DNA directed) polypeptide A (220kD)
alpha (low affinity)
interleukin 1 beta convertase
interleukin 1 beta convertase
(phosphatidylinositol-specific)
precursor, mRNA, complete cds.
Human glycoprotein receptor gp330 precursor, mRNA, complete cds.
Human glycoprotein receptor gp330 precursor, mRNA, complete cds.
Human glycoprotein receptor gp330
Human glycoprotein receptor gp330 precursor, mRNA, complete cds.
Human glycoprotein receptor gp330 precursor, mRNA, complete cds.
Human glycoprotein receptor gp330 precursor, mRNA, complete cds.
Human glycoprotein receptor gp330 precursor, mRNA, complete cds.
TLE1, transducin-like enhancer of split 1, homolog of Drosophila
split 1, homolog of Drosophila
split 1, homolog of Drosophila
TLE2, transducin-like enhancer of split 2, homolog of Drosophila E(spl)
TLE2, transducin-like enhancer of
NFATC3, nuclear factor of activated T-cells, cytoplasmic
activated T-cells, cytoplasmic
NFATC3, nuclear factor of

GGCTGTGAAC [C/T] GTGTGCCTGC
CCTGACAGGA [C/T] TGGAGAAGCG
TGCTGCGCTC [A/G] GCGGACCTGA
TGGAGCAGGA [G/T] AAGCACCGGC
CGTTTGGCAG [G/A] GCAGCCAGGC
GGTCCCAATG [G/A] GCAAGGAGCC
CCACAGAGAT [C/T] CCTGACTTCA
TTTGGGATGA [C/T] TCCAGCTGCC
TTCAGGCTTA [T/C] GGTATGAAGC
GATTCTCCTC [G/A] GGCATCACAG
TTGAGTTCCA [C/T] TGTGCTGTGC

LAMB3, laminin, beta 3 (nicein (125kD), kalinin (140kD), BM600 (125kD))
LAMB3, laminin, beta 3 (nicein (125kD), kalinin (140kD), BM600 (125kD))
LAMB3, laminin, beta 3 (nicein (125kD), kalinin (140kD), BM600 (125kD))
adducin, beta subunit
villin
villin
villin
villin
HSPG2, heparan sulfate proteoglycan 2 (perlecan)
HSPG2, heparan sulfate proteoglycan 2 (perlecan)
HSPG2, heparan sulfate proteoglycan 2 (perlecan)
HSPG2, heparan sulfate proteoglycan 2 (perlecan)
HSPG2, heparan sulfate proteoglycan 2 (perlecan)
HSPG2, heparan sulfate proteoglycan 2 (perlecan)
motor protein
motor protein
phosphoglucomutase-related protein
phosphoglucomutase-related protein
KIF5A, kinesin family member 5A
KIF5A, kinesin family member 5A
1553
3562
3546
1266
1366
1468
1932
2438
[illegible]
[illegible]
[illegible]
[illegible]
13130
10340
12392
3416
4588
9582
[illegible]
[illegible]
1150
1238
1043
[illegible]
2767


HT4211
HT4211
HT4211
HT0652
[illegible]
HT1466
HT1466
HT1466
HT33633
HT4301
HT4301
HT4301
HT1396
HT1396
HT1396
HT1396
HT1396
HT1396
HT4237
HT4237
HT4237
HT28223
HT28223
HT4401
HT4401


WIAF-14026
WIAF-14029
WIAF-14030
WIAF-13571
[illegible]
WIAF-14107
WIAF-14108
WIAF-14110
WIAF-13648
WIAF-13676
WIAF-13677
[illegible]
WIAF-14142
WIAF-14150
WIAF-14151
[illegible]
WIAF-14154
WIAF-14156
WIAF-13890
[illegible]
WIAF-13911
WIAF-14034
WIAF-14035
WIAF-13615
WIAF-13623


[illegible]
G4038u10
G4038u11
[illegible]
[illegible]
[illegible]
[illegible]
G4050u4
G4057u1
[illegible]
[illegible]
[illegible]
G4080u1
[illegible]
G4080u3
G4080u4
[illegible]
[illegible]
[illegible]
G4096u2
[illegible]
G4109u1
[illegible]
G4112u1
G4112u2



ACCTGAAGAG [C/T] GTGATGCTGC
TCGAGGAGAA [G/C] TCTGGCATGG
AGTTCCACAG [G/A] AAATACCGGA
CCTCCTTCCT [G/A] CGGGCACCCA
CCAGCCGCCT [C/T] TTTGACCAGT
GGATCCCAGC [T/C] GATGTAGACC
TAGAGTTGGC [A/G] TTGGAAACAT
TATTTAGTAG [T/C] GAAACCAAAT
TCATTATCAC [A/G] CAAGGTAACT
CGGTCAGTGG [C/T] TTCCAGCCAG
CATGCTGATG [A/G] TCACACACCT
AGCCTCCTGA [A/T] GCTGATGGTG
TACATCAACT [C/T] TTTGGAGATG
GCTTTGCCTA [C/T] ATTGCCCGCC
GTCACTGGGA [T/C] TGCTGTAGCG
GGGAGTACAC [G/A] TGCCAGACTG


TNNI2, troponin I, skeletal, fast
TNNI2, troponin I, skeletal, fast
CRYAB, crystallin, alpha B
CRYAB, crystallin, alpha B
CRYAB, crystallin, alpha B
CRYAB, crystallin, alpha B
PIGF, phosphatidylinositol glycan, class F
PIGF, phosphatidylinositol glycan, class F
PIGF, phosphatidylinositol glycan, class F
TJP1, tight junction protein 1 (zona occludens 1)
TJP1, tight junction protein 1 (zona occludens 1)
TJP1, tight junction protein 1 (zona occludens 1)
SCYA5, small inducible cytokine A5 (RANTES)
SCYA5, small inducible cytokine A5 (RANTES)
FCGR2B, Fc fragment of IgG, low affinity IIb, receptor for (CD32)
FCGR2B, Fc fragment of IgG, low affinity IIb, receptor for (CD32)
PDGFA, platelet-derived growth
synthetase 1, mitochondrial
synthetase 1, mitochondrial
synthetase 1, mitochondrial
receptor

TCCACCCCAG [T/C] GGGGCCATAG
TTCCTAACAC [A/G] GTGACTGTGG
CCTGCCCTAT [C/T] ACCGACGCCC
ATATTGTCCA [T/C] AAGCATGGAG
GGGACAGAAG [C/T] ACATGACCGC
TGGTGAAGCT [G/C] TTCGGGCCCT
TGAAGTACTA [C/T] ACCCTAGAGG
AGAATTTCGT [C/A] GTTGGGAAGT
CTGTTGATCC [T/C] GATGAACCTG
TGGATTTCAA [G/T] AATATACCAT
ACACTTACTC [G/A] GAGTGGCACA
GGGAAAGTGC [A/G] TGTGAGCGGC
CTGGCAAAAG [G/T] TGGCCTATTG
GTTATTTTCT [T/C] CTTACCCTGG
AATGAAACCA [C/A] ATCCGTGGTT
CCCTGCACCA [T/C] GCCTTGGAAC
CTGACGTCTT [T/C] CTGGAGGCAT


AOC3, amine oxidase, copper containing 3 (vascular adhesion protein 1)
AOC3, amine oxidase, copper containing 3 (vascular adhesion protein 1)
AOC3, amine oxidase, copper containing 3 (vascular adhesion protein 1)
CTH, cystathionase (cystathionine gamma-lyase)
CYBA, cytochrome b-245, alpha polypeptide
CYBA, cytochrome b-245, alpha polypeptide
CYB5, cytochrome b-5
UQCRC2, ubiquinol-cytochrome c reductase core protein II
DSC3, desmocollin 3
DSC3, desmocollin 3
DSC3, desmocollin 3
GPD2, glycerol-3-phosphate dehydrogenase 2 (mitochondrial)
GPD2, glycerol-3-phosphate dehydrogenase 2 (mitochondrial)
GPD2, glycerol-3-phosphate dehydrogenase 2 (mitochondrial)
GRB2, growth factor receptor-bound protein 2
EYA1, eyes absent (Drosophila) homolog 1
GYS1, glycogen synthase 1 (muscle)

rCTGGCAGAA [C/T] TTGCGGCTCA
PGGACAGCCT [G/A] CCCCAGGCAG
AAGCCACTGT [G/C] GCTTCTGGCA
CCAGCGACCC [G/A] GCAGGACCTA
TATGCAGCTG [C/T] TACCCTCCAG
GGGATCCCAG [C/T] TTTGAGGAGG
AACCCCAACC [G/A] CGTTCGCATG
GCTTCCTCAA [T/C] GGGGAGGTGC
AGCAGAGCCA [G/A] GGCACCTGCA
ATGCTCTTCG [G/C] GTGCCTCCAC
ATGAGGAGGA [G/A] GAAGAGCCAC
GCCTGGCCAA [C/T] GCTGCTGCCT
GCAGCAGAGT [C/T] GCCACATCAT
GACGCTTGTG [C/T] TTGCCCTGGC
ACAGGGAGGT [G/A] GCCGAGATCC
TCATGCTGGC [T/C] GTGGGAGGAG
CCATCGCGCT [A/G] GCACTGCTGG
AGTGAAGATC [A/C] AGAAGACAAG
ATGATCGTTT [C/T] CTTAGTCAGT
TAGTAGCAGT [C/T] TTAGAATACA
CTCAGCCCCG [A/C] AGTGCTTCAG
GTTGTAATGA [A/G] TTTATAATGG
ATTTATAATG [G/A] AAGGAACTCT
AAGCAGGAAC [T/C] GGCCAAGTAC


OXTR, oxytocin receptor
PCK1, phosphoenolpyruvate carboxykinase 1 (soluble)
PGK1, phosphoglycerate kinase 1
DNA repair protein XRCC1
DNA repair protein XRCC1
DNA repair protein XRCC1
DNA repair protein XRCC1
SCTR, secretin receptor
SCTR, secretin receptor
SHC1
SHC1
SLC2A4, solute carrier family 2 (facilitated glucose transporter), member 4
SLC2A5, solute carrier family 2 (facilitated glucose transporter), member 5
SLC2A5, solute carrier family 2 (facilitated glucose transporter), member 5
SLC2A5, solute carrier family 2 (facilitated glucose transporter), member 5
Human (HepG2) glucose transporter gene mRNA, complete cds.
Human (HepG2) glucose transporter gene mRNA, complete cds.
SOS1
SOS1
SOS1
SST, somatostatin

kGTATTGTCC [A/G] TATCAGACCT
CCATTGACAT [G/C] GCCACGGAAA
GCTACATTGC [C/T] GAGCAGAACA
GCTGCAGCAA [A/G] TGCTCGCCGG
TCTGCACCTG [C/T] AGGCCCGGCT
GATCTGTAAC [G/A] TGGTGGCCAT
AATGCAAGCA [T/G] GGATGCAGTC
CCAAGCACCT [C/T] CTTCCTGCTC
GCCGCTGCCC [G/A] CTCATGCTGA
CGCGCTACAG [T/C] CAGCGCCCAG
TCGGCCTCTA [T/C] GACTCCGTCA
TGCACCACAG [G/A] AGCCATGGCG
TACGGGAATC [A/G] CCGTTTTGAA
ATCCTGACCA [T/C] GGTGCGGACT
ACTGTGGCAT [C/A] GAGATATACT
ATATTAACTT [C/G] ATGGCTGCAA


SST, somatostatin


SUR, sulfonylurea receptor
(hyperinsulinemia)


TKT, transketolase (Wernicke- 
Korsakoff syndrome) 


TNFRSF1B, tumor necrosis factor
receptor superfamily, member 1B


TNFRSF1B, tumor necrosis factor
receptor superfamily, member 1B


TNFRSF1B, tumor necrosis factor
receptor superfamily, member 1B


TNFRSF1B, tumor necrosis factor
receptor superfamily, member 1B


TNFRSF1B, tumor necrosis factor
receptor superfamily, member 1B


TRAP3 


UCP2, uncoupling protein 2 
(mitochondrial, proton carrier) 


UCP2, uncoupling protein 2 
(mitochondrial, proton carrier) 


UCP2, uncoupling protein 2 
(mitochondrial, proton carrier) 


UCP2, uncoupling protein 2
(mitochondrial, proton carrier)


UCP2, uncoupling protein 2 
(mitochondrial, proton carrier) 


IDE, insulin-degrading enzyme


IDE, insulin-degrading enzyme
GGACGACTCC [G/A] AGCTGCCTAC


TGGGAGGCAC [G/A] GTGATTGGAA 


TGATTGGAAG [T/C] GCCCGGTGCA 


CGTGGGATCA [C/G] CAATCTCTGT 


CACTGTGGAT [A/G] CCTGGCCCTT 


ATGGCAGCCT [T/C] ACAGGTGCCA


CAGCGTTCTT [C/T] GTGACGTTAG 


AATATCTCGC [T/C] GTGGAGTCCC


ACTTCAAACG [G/T] ATGACAGCAC 


CCTCACATAC [G/C] AGGCCTCCAT 


TCCGGGATCT [C/T] AGTAAGCCAG 


AAGTATATCA [T/G] TTTCAAATAT 


GTAGCCATGC [T/C] TACGCAAATC 


PFKM, phosphofructokinase, muscle


PFKM, phosphofructokinase, muscle


PFKM, phosphofructokinase, muscle


PFKM, phosphofructokinase, muscle


PFKM, phosphofructokinase, muscle


phosphofructokinase, liver


CPT1A, carnitine 
palmitoyltransferase I, liver 


CPT1A, carnitine 
palmitoyltransferase I, liver 


CPT1A, carnitine 
palmitoyltransferase I, liver 


CPT1A, carnitine 
palmitoyltransferase I, liver 


NSMAF, neutral sphingomyelinase 
(N-SMase) activation associated 
factor 


NSMAF, neutral sphingomyelinase 
(N-SMase) activation associated 
factor 


NSMAF, neutral sphingomyelinase 
(N-SMase) activation associated 
factor 
protein phosphatase 2A, 130 kDa
regulatory subunit 1


protein phosphatase 2A, 130 kDa
regulatory subunit 1


protein phosphatase 2A, 130 kDa
regulatory subunit 1


protein phosphatase 2A, 130 kDa
regulatory subunit


IGF1R, insulin-like growth factor
1 receptor


IGF1R, insulin-like growth factor
1 receptor


IGF1R, insulin-like growth factor
1 receptor


IGF1R, insulin-like growth factor
1 receptor


IGF1R, insulin-like growth factor
1 receptor


IGF1R, insulin-like growth factor
1 receptor


IGF1R, insulin-like growth factor
IGF1R, insulin-like growth factor
1 receptor


IGF1R, insulin-like growth factor
1 receptor


IGF1R, insulin-like growth factor
1 receptor


retinoic acid-binding protein II 


retinoic acid-binding protein II 


EMR1, egf-like module containing,
mucin-like, hormone receptor-like
EMR1, egf-like module containing,
mucin-like, hormone receptor-like
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TTACTATTGC [A/G] CTTGCAAACA 


TCACCAGCAG [G/C] GTCTGCCCTG 


CTCAGCAAAT [G/A] TCACTCCGGC 


ACACTGGCAT [C/T] TTTTTGGAAA 


GACAACAAGA [C/T] GGGCTGCGCC 


TGCCTCCCTA [C/T] GCCTTCTTCT 


AACGTGAGCC [A/G] GGAGCAGCGT 


ATTGTTTTTA [A/G] GGTGAGAAAT 


CTCGCCGAAC [G/A] ACCCTGTCAC 


TCCTGCCGCT [C/T] GATTTCTCCA 


EMR1, egf-like module containing,
mucin-like, hormone receptor-like
sequence 1


EMR1, egf-like module containing,
mucin-like, hormone receptor-like
sequence 1


EMR1, egf-like module containing,
mucin-like, hormone receptor-like
sequence 1


EMR1, egf-like module containing,
mucin-like, hormone receptor-like
sequence 1


EMR1, egf-like module containing,
mucin-like, hormone receptor-like
sequence 1


EMR1, egf-like module containing,
sequence 1


EMR1, egf-like module containing,
sequence 1


EMR1, egf-like module containing,
mucin-like, hormone receptor-like
sequence 1


RARA, retinoic acid receptor, 
alpha 


retinoic acid receptor, beta 


retinoic acid receptor, beta 


RXRA, retinoid X receptor, alpha 


RXRA, retinoid X receptor, alpha
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CCTATCCTTA [C/A] CCTCGGTACA 


TTTCTGCTGC [C/T] GCCAACAACA 
TCACCCAGGA [C/T] GCCCAGCTGA


CGCCACGCGC [G/A] CCTGCGGCCT 
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TGCCTACAAA [C/A] AGGTGAAATT 
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ACTGCCAGGC [G/A] TTCAGTGGCA 
RXRA, retinoid X receptor, alpha


RXRA, retinoid X receptor, alpha


RARB, retinoic acid receptor,
beta


PCSK2, proprotein convertase
subtilisin/kexin type 2


PCSK2, proprotein convertase
subtilisin/kexin type 2


PCSK2, proprotein convertase
subtilisin/kexin type 2


CBG, corticosteroid binding 
globulin 


CBG, corticosteroid binding
globulin


CBG, corticosteroid binding
globulin


TPO, thyroid peroxidase


TPO, thyroid peroxidase


DIO2, deiodinase, iodothyronine,
type II


DIO2, deiodinase, iodothyronine,
type II


DIO2, deiodinase, iodothyronine,
type II


Human ret proto-oncogene mRNA for 
tyrosine kinase. 


Human ret proto-oncogene mRNA for 
tyrosine kinase. 


Human ret proto-oncogene mRNA for 
tyrosine kinase. 


Human ret proto-oncogene mRNA for 
tyrosine kinase. 


Human ret proto-oncogene mRNA for 
tyrosine kinase.
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ATCTACAAGG [A/G] CTACATCCGG 


GCCACTAGCT [C/T] CTCCGAGAAT 


GTTCCCTTTA [T/G] GATTATCTGA 


GCTAAATCAA [T/C] GCTGAAGTTA 


TATCAGACAG [T/G] GTTGATGAGG 


TCCTTATCAT [G/A] ACCTAGTGCC 


GTTACGCCCC [T/G] CATTCCCAAA 


TGTTGGACGA [G/A] AGCTTGAACA 


AAATTTGGCA [G/A] CAAGCACAAA 


TGGAGTTGCC [A/T] AGATGAATAC 
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GGAGAGGGCT [C/T] TTCACCAATG 
BMP7, bone morphogenetic protein
7 (osteogenic protein 1)


BMP7, bone morphogenetic protein
7 (osteogenic protein 1)


BMP7, bone morphogenetic protein
7 (osteogenic protein 1)


BMPR1B, bone morphogenetic
protein receptor, type IB


BMPR1B, bone morphogenetic
protein receptor, type IB


BMPR1B, bone morphogenetic
protein receptor, type IB


BMPR1B, bone morphogenetic
protein receptor, type IB


BMPR1B, bone morphogenetic
protein receptor, type IB


BMPR1B, bone morphogenetic
protein receptor, type IB


BMPR2, bone morphogenetic protein 
receptor, type II 
(serine/threonine kinase) 


BMPR2, bone morphogenetic protein 
receptor, type II 
(serine/threonine kinase) 


CALB1, calbindin 1, (28kD)


calcium-sensing receptor


calcium-sensing receptor


calcium-sensing receptor


calcium-sensing receptor


calcium-sensing receptor


calcium-sensing receptor


calcium-sensing receptor
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ATGATGTGGG [G/A] CCACCTGGTC 
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GTGGTCTTGG [G/C] CTGTTTGTTT 
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GCGCCAATGC [C/T] TCCTTCACCT 


CAACAAATGC [C/T] TGGAACTGGC 


AGGGTATACT [C/T] CAAGGATGCA 


TRIP15, thyroid receptor
interacting protein 15 


thyroid receptor interactor 14 


thyroid receptor interactor 8 


PSMC5, proteasome (prosome, 
macropain) 26S subunit, ATPase, 5 


glutamate receptor 3, flip isoform 


glutamate receptor 3, flip isoform 


glutamate receptor 3, flip isoform 


glutamate receptor 3, flip isoform 


glutamate receptor 3, flip isoform 


glutamate receptor 3, flip isoform 


GRM3, glutamate receptor, 
metabotropic 3 


GRM3, glutamate receptor, 
metabotropic 3 


GRM3, glutamate receptor,
metabotropic 3 


GRM3, glutamate receptor, 
metabotropic 3 


GRM3, glutamate receptor, 
metabotropic 3 


GRM3, glutamate receptor, 
metabotropic 3 


GRM3, glutamate receptor, 
metabotropic 3 


GAD1, glutamate decarboxylase 1 
(brain, 67kD) 


GAD1, glutamate decarboxylase 1
(brain, 67kD) 
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CCTCATGGAA [C/T] AAATAACACT 
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ACACTACCCG [A/G] AGAATTGGTC 
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TGTGAGAGCC [C/T] TGGATGGACC 


AAGTAACAGC [A/G] GTGTTGCCAA 


GGCGAGGAAC [T/C] CTACATCATC 


GCCGTCAGGG [C/A] ATCTCCCCTA 


TTGGAGTTCC [A/G] GCCATGTCTA 


GAD1, glutamate decarboxylase 1
(brain, 67kD)


GAD1, glutamate decarboxylase 1
(brain, 67kD)


GAD1, glutamate decarboxylase 1
(brain, 67kD)


HTR3, 5-hydroxytryptamine
(serotonin) receptor 3


EFNB3, ephrin-B3


LAMA2, laminin, alpha 2 (merosin, 
congenital muscular dystrophy) 


LAMA2, laminin, alpha 2 (merosin, 
congenital muscular dystrophy) 


LAMA2, laminin, alpha 2 (merosin, 
congenital muscular dystrophy) 


LAMA2, laminin, alpha 2 (merosin, 
congenital muscular dystrophy) 


LAMA2, laminin, alpha 2 (merosin, 
congenital muscular dystrophy) 


LAMA2, laminin, alpha 2 (merosin, 
congenital muscular dystrophy) 


LAMA2, laminin, alpha 2 (merosin, 
congenital muscular dystrophy) 


LHX1, LIM homeobox protein 1


LHX1, LIM homeobox protein 1


LHX1, LIM homeobox protein 1


CSPG3, chondroitin sulfate
proteoglycan 3 (neurocan)
CSPG3, chondroitin sulfate
proteoglycan 3 (neurocan)


CSPG3, chondroitin sulfate
proteoglycan 3 (neurocan)
proteoglycan 3 (neurocan)


CSPG3, chondroitin sulfate
proteoglycan 3 (neurocan)


TLX, tailless homolog 
(Drosophila) 
TCGAGACTAT [G/A] TTGTGACCAA


GAGGAAGCCT [G/C] AAAACAGTGA


AAGTAAAAGA [C/T] TTGAAAAAGA


TCCATGAATA [T/A] CAGCATTTGG


CCTGGAGCTA [G/A] GTCCATGAAT 
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STX1A, syntaxin 1A (brain) 


Human B7 mRNA, complete cds.


Human B7 mRNA, complete cds. 


HTR2B, 5-hydroxytryptamine 
(serotonin) receptor 2B 


HTR2B, 5-hydroxytryptamine 
(serotonin) receptor 2B 


TPH, tryptophan hydroxylase 
(tryptophan 5-monooxygenase)
SATTACCTGC [A/C] AACAGGAATG


CCTTCTATAC [C/T] CCAGAGCCAG 
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CAGACGGAAA [G/T] TGCTCACACC 
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AGATTACCAA [A/G] CCCAACGTGT 


TGTCCTGCAG [T/G] GACAAGATTG 


GGGTTGGAGG [T/C] CCGTGACTGC 


TPH, tryptophan hydroxylase
(tryptophan 5-monooxygenase)


TPH, tryptophan hydroxylase
(tryptophan 5-monooxygenase)


TPH, tryptophan hydroxylase
(tryptophan 5-monooxygenase)
methyltransferase


ASMT, acetylserotonin N-
methyltransferase


ASMT, acetylserotonin N-
ASMT, acetylserotonin N-
methyltransferase


ASMT, acetylserotonin N-
methyltransferase


ASMT, acetylserotonin N-
methyltransferase
ADAR, adenosine deaminase, RNA-
ADAR, adenosine deaminase, RNA-
specific


ADAR, adenosine deaminase, RNA-
specific


ADARB1, adenosine deaminase, RNA-
specific, B1 (homolog of rat RED1)


ADARB1, adenosine deaminase, RNA-
specific, B1 (homolog of rat RED1)


DVL3, dishevelled 3 (homologous
to Drosophila dsh)
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CTCAGGCAAA [T/C] GGAGTAAGCC 


ACCGTCTGAA [C/G] TTGTAGGGAG 


CAAGGGTACC [G/A] CGTCGATGCT 


TTTTGGCTTC [C/G] TGGCCTTTGG 


CCCAGTTCAT [G/T] GATGGTGCCC


CTGTGAGTGG [C/G] ATTTGTTTTG 


GGTCTGCAGT [G/T] GCCACCTGAA 


GTGTAACAGA [A/C] GACGGGGGTT 


AGGCCATCAA [G/C] ATGGGGCAGT 


GTCAACAGTA [A/G] CCTGGTGTGC 


CCTTCACTTA [T/C] GAGGATCCCA


ACCTGTGTGG [C/T] TCATGCAGAG 


CTTTGGGATA [C/T] TCATGTGGGA 


GAGACCTTCA [C/T] CCTTTACTAC 


TTTGAGGTGC [A/C] AGGCTCAGCA


CTATGACCAG [G/A] CAGAAGACGA 


GGGGCTTTGG [C/G] CTTCCTCCTG
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GGAGGTCATT [G/C] GGACAGGCTC 


TTCCTCAGGC [A/G] GCGGGAGGGC 


AGCCATTGGA [C/T] TGGAGTGCTA 


AGCTGAACCT [G/C] CTGACAGAGT 


TTTGGTGTTC [G/A] ATAGCATATT 


TGAAGCAAAT [G/C] CAGATACTTC 


ICAM5, intercellular adhesion
molecule 5, telencephalin


ICAM5, intercellular adhesion
molecule 5, telencephalin


SOS1, son of sevenless
(Drosophila) homolog 1


SOS1, son of sevenless
(Drosophila) homolog 1


SOS1, son of sevenless
(Drosophila) homolog 1


SMOH, smoothened (Drosophila)
homolog


SMOH, smoothened (Drosophila)
homolog


SMOH, smoothened (Drosophila)
homolog
LIMK2, LIM domain kinase 2


MADH2, MAD (mothers against
decapentaplegic, Drosophila)
homolog 2


RAD51, RAD51 (S. cerevisiae)
homolog (E. coli RecA homolog)
3GACCTGTCG [C/T] TCATCTGCCT


ACCAGCGGAC [A/G] CTCGACCCCC 


ATCGCAAATG [C/a] ACAGGACAGG 


TCCTGTCTTC [C/t] TGGCAATGTT 


TGTGCACACT [A/g] CCATGTAAGC 


TGTGCTGACT [A/t] CCGGGGTGTC 


AAGCCAGAGG [G/a] GTTCTCAAGT


TCATAGACTA [C/t] GATGAACACA 
TTACCATGTA [T/C] ACCACCTGCA
GRM4, glutamate receptor,
metabotropic 4


GRM4, glutamate receptor,
metabotropic 4


GRM7, glutamate receptor,
metabotropic 7


GRM7, glutamate receptor,
metabotropic 7


GRM7, glutamate receptor,
metabotropic 7


GRM7, glutamate receptor,
metabotropic 7


GRM7, glutamate receptor,
metabotropic 7


GRM7, glutamate receptor,
metabotropic 7


GRM8, glutamate receptor,
metabotropic 8
metabotropic 8


GRM8, glutamate receptor,
metabotropic 8


GRM8, glutamate receptor, 
metabotropic 8 


GRM8, glutamate receptor, 
metabotropic 8 


GRM8, glutamate receptor, 
metabotropic 8 


GFRA2, GDNF family receptor alpha 
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GFRA1, GDNF family receptor alpha 
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GFRA1, GDNF family receptor alpha 
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GFRA1, GDNF family receptor alpha 
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NTRK2, neurotrophic tyrosine 
kinase, receptor, type 2 


NTRK2, neurotrophic tyrosine 
kinase, receptor, type 2 
While this invention has been particularly shown and described with references to
preferred embodiments thereof, it will be understood by those skilled in the art that 
various changes in form and details may be made therein without departing from the 
scope of the invention encompassed by the appended claims. 



